mypy
8f2d0f12 - [mypyc] Add `librt.strings.toupper` and `tolower` codepoint primitives (#21553)

Commit
28 days ago
[mypyc] Add `librt.strings.toupper` and `tolower` codepoint primitives (#21553)

6th PR of #21418. This PR introduces two `i32 -> i32` case-conversion helpers, alongside the existing classifiers.

**The constraint to flag**: A single `i32` holds one codepoint, but some Unicode case mappings expand to multiple codepoints, e.g. `'ß'.upper()` becomes `'SS'` and `'ﬁ'.upper()` (the U+FB01 ligature) becomes `'FI'`. For those inputs the primitive _returns the input unchanged_. This is the same codepoint/string split CPython makes between `Py_UNICODE_TOUPPER` (codepoint) and `str.upper()` (string); note that the former returns the **first codepoint** of the expansion. Users needing full Unicode case conversion should call `s.upper()` / `s.lower()` on the string, for which we already have mypyc primitives (#20948).

For ASCII benchmarks, the codepoint primitives are ~5x faster than their `str` counterparts, since they avoid the 1-char allocation.
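The constraint can be illustrated with a small pure-Python model. This is only a sketch of the behavior described in the message, not the `librt` implementation: `toupper_model` and `tolower_model` are hypothetical names, and the model assumes that "expands to multiple codepoints" corresponds to `str.upper()` / `str.lower()` returning a string longer than one character.

```python
# Hypothetical reference model of the described semantics.
# NOT the librt.strings implementation; names are illustrative only.


def toupper_model(cp: int) -> int:
    """Return the uppercase codepoint, or cp unchanged if the mapping expands."""
    upper = chr(cp).upper()  # full Unicode mapping, may be longer than 1 char
    return ord(upper) if len(upper) == 1 else cp


def tolower_model(cp: int) -> int:
    """Return the lowercase codepoint, or cp unchanged if the mapping expands."""
    lower = chr(cp).lower()
    return ord(lower) if len(lower) == 1 else cp


# Simple one-to-one mapping: 'a' -> 'A'.
assert toupper_model(ord("a")) == ord("A")

# 'ß' (U+00DF) uppercases to the two-codepoint string 'SS',
# so the model hands the input back unchanged.
assert "\u00df".upper() == "SS"
assert toupper_model(0x00DF) == 0x00DF
```

Under this model, callers that need `'SS'` for `'ß'` still have to go through the string-level `s.upper()`, as the message recommends.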