llvm-project
146a9193 - [Clang][Lexer] Reland "Detect SSE4.2 availability at runtime in fastParseASCIIIdentifier" (#175452)

Commit
1 day ago
This PR reopens #171914, which was merged and then reverted by #174946 because of compilation failures.

This change attempts to maximize use of the SSE fast path in `fastParseASCIIIdentifier`. If the binary is compiled with SSE4.2 enabled, or if we are not compiling for x86, the behavior is exactly the same as before, so those configurations see no regressions. Otherwise, we compile both the SSE fast path and the scalar loop. At runtime, we check whether SSE4.2 is available and dispatch to the right function; the `target` attribute is what allows the SSE4.2 variant to be compiled in a binary that does not otherwise enable SSE4.2.

If SSE4.2 _is_ available, this allows a net performance improvement. Otherwise, there is a very slight, negligible regression, which I believe is perfectly reasonable for a processor that does not support SSE4.2. I checked locally on an old x86 processor with QEMU to ensure this doesn't break compatibility.

The benchmark results are available at [llvm-compile-time-tracker](https://llvm-compile-time-tracker.com/compare.php?from=f88d060c4176d17df56587a083944637ca865cb3&to=d5485438edd460892bf210916827e0d92fc24065&stat=instructions%3Au).
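For readers unfamiliar with this pattern, the three-way selection described above (SSE4.2 enabled at compile time, runtime dispatch on x86, scalar elsewhere) might look roughly like the sketch below. It is not the code from this PR: the names `scanIdentifier`, `scanIdentifierScalar`, `scanIdentifierSSE42`, `isIdentChar`, and the `SKETCH_HAS_X86_DISPATCH` macro are hypothetical, the use of `__builtin_cpu_supports` for the runtime check is an assumption (the message does not say how the check is performed), and the identifier set is simplified to `[A-Za-z0-9_]`.

```cpp
#include <cstddef>

// Assumption: GCC or Clang targeting x86, so that the `target` attribute and
// __builtin_cpu_supports are available. Everything else takes the scalar path.
#if (defined(__x86_64__) || defined(__i386__)) &&                              \
    (defined(__GNUC__) || defined(__clang__))
#include <nmmintrin.h> // SSE4.2 intrinsics (_mm_cmpistri)
#define SKETCH_HAS_X86_DISPATCH 1
#else
#define SKETCH_HAS_X86_DISPATCH 0
#endif

// Simplified identifier-character test for this sketch.
static bool isIdentChar(unsigned char C) {
  return (C >= 'a' && C <= 'z') || (C >= 'A' && C <= 'Z') ||
         (C >= '0' && C <= '9') || C == '_';
}

// Portable fallback: advance one byte at a time.
static const char *scanIdentifierScalar(const char *Cur, const char *End) {
  while (Cur != End && isIdentChar(static_cast<unsigned char>(*Cur)))
    ++Cur;
  return Cur;
}

#if SKETCH_HAS_X86_DISPATCH
// The `target` attribute enables SSE4.2 for this one function only, so it can
// be compiled even when the rest of the binary is built without -msse4.2.
__attribute__((target("sse4.2")))
static const char *scanIdentifierSSE42(const char *Cur, const char *End) {
  // Four inclusive byte ranges: '_', 'A'-'Z', 'a'-'z', '0'-'9'.
  const __m128i Ranges = _mm_setr_epi8('_', '_', 'A', 'Z', 'a', 'z', '0', '9',
                                       0, 0, 0, 0, 0, 0, 0, 0);
  while (End - Cur >= 16) {
    __m128i Chunk = _mm_loadu_si128(reinterpret_cast<const __m128i *>(Cur));
    // Index of the first byte that is outside every range (16 if none).
    int Consumed = _mm_cmpistri(Ranges, Chunk,
                                _SIDD_UBYTE_OPS | _SIDD_CMP_RANGES |
                                    _SIDD_LEAST_SIGNIFICANT |
                                    _SIDD_NEGATIVE_POLARITY);
    Cur += Consumed;
    if (Consumed != 16)
      return Cur;
  }
  // Fewer than 16 bytes remain: finish with the scalar loop.
  return scanIdentifierScalar(Cur, End);
}
#endif

// Entry point: choose the implementation at compile time when possible,
// otherwise at runtime.
const char *scanIdentifier(const char *Cur, const char *End) {
#if SKETCH_HAS_X86_DISPATCH && defined(__SSE4_2__)
  // SSE4.2 is guaranteed by the build flags: no runtime check needed.
  return scanIdentifierSSE42(Cur, End);
#elif SKETCH_HAS_X86_DISPATCH
  // x86 without SSE4.2 at compile time: query the CPU once, then dispatch.
  static const bool HasSSE42 = __builtin_cpu_supports("sse4.2") != 0;
  return HasSSE42 ? scanIdentifierSSE42(Cur, End)
                  : scanIdentifierScalar(Cur, End);
#else
  // Not x86: scalar loop only.
  return scanIdentifierScalar(Cur, End);
#endif
}
```

In this sketch the runtime check is cached in a function-local static, so the cost on a processor without SSE4.2 is one extra branch per call, which is consistent with the slight regression the message describes for that case.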