Better support for string formatting (#7418)
Fixes https://github.com/python/mypy/issues/1444
Fixes https://github.com/python/mypy/issues/2639
Fixes https://github.com/python/mypy/issues/7114
This PR does three related things:
* Fixes some corner cases in `%` formatting
* Tightens `bytes` vs `str` interactions (including those that are technically not errors)
* Adds type checking for `str.format()` calls
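
As a rough illustration of the kind of mistake that checking `str.format()` calls is meant to surface, here is how a few bad calls behave at runtime in plain CPython. The helper `try_format` is hypothetical and exists only for this sketch; it is not part of mypy or of this PR.

```python
# Runtime behaviour of a few str.format() calls (plain CPython, no mypy involved).
# `try_format` is a hypothetical helper used only for this illustration.

def try_format(template: str, *args: object, **kwargs: object) -> str:
    """Return the formatted string, or the exception type name if formatting fails."""
    try:
        return template.format(*args, **kwargs)
    except (IndexError, KeyError, ValueError) as exc:
        return type(exc).__name__


ok = try_format("{} and {}", 1, 2)            # enough positional arguments
too_few = try_format("{} and {}", 1)          # one positional argument is missing
missing_key = try_format("{name}", nam="x")   # misspelled keyword argument
bad_spec = try_format("{:d}", "text")         # 'd' is not a valid format code for str

print(ok, too_few, missing_key, bad_spec)
```

Each of the failing calls only blows up when the line actually executes, which is why catching them statically is useful.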
This also fixes a few issues that this PR uncovered during testing (notably a bug in the f-string transformation), and adds or extends docstrings and comments in existing code. The implementation is mostly straightforward; there are a few hacky parts, but IMO nothing unreasonable.
Here are a few comments:
* It was hard to keep the approach to `str.format()` calls purely regexp-based (as for `%` formatting), mostly because the former can be both nested and repeated (and we must support nested formatting because this is how we support f-strings). CPython itself uses a custom parser, but it is huge, so I decided on a mixed approach. This way we can keep the code simple while still maintaining practically one-to-one compatibility with runtime behavior (although the error messages sometimes differ).
* This causes a few hundred errors internally, and I am not sure what to do with them. The most common ones are:
  - Unicode upcast (`'%s' % u'...'` results in `unicode`, not `str`, on Python 2)
  - Using `'%s' % some_bytes` and/or `'{:s}'.format(some_bytes)` on Python 3
  - Unused arguments in `str.format()` calls (probably because at runtime they are silently ignored)
* I added a new error code for string interpolation/formatting and used it everywhere in old and new code. We might later split the `str` vs `bytes` errors into a separate error code, because technically they are not type errors, just suspicious/dangerous code.
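
To show why the `str` vs `bytes` cases and the unused arguments above are "suspicious" rather than outright errors, here is a minimal sketch of their runtime behaviour on Python 3. The variable names are made up for illustration, and nothing here reflects how mypy reports these cases.

```python
# Python 3 runtime behaviour of patterns that run without error
# but usually do not do what the author intended.

payload = b"abc"  # hypothetical bytes value, e.g. read from a socket

# Interpolating bytes into a str inserts the repr of the bytes object.
percent_result = "%s" % payload
format_result = "{}".format(payload)

# Decoding first gives the text that was most likely intended.
decoded_result = "%s" % payload.decode("ascii")

# Extra positional arguments to str.format() are silently ignored.
unused_result = "{}".format(1, 2)

print(percent_result, format_result, decoded_result, unused_result)
```

None of these lines raises, so only a static check can point them out.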