llvm-project
961d52f3 - [Clang] [Lexer] Detect SSE4.2 availability at runtime in fastParseASCIIIdentifier (#171914)

Commit
116 days ago
This change attempts to maximize use of the SSE fast path in `fastParseASCIIIdentifier`.

When compiling for x86, we compile both the SSE fast path and the scalar loop. At runtime, we check whether SSE4.2 is available and dispatch to the right function using the `target` attribute. If SSE4.2 _is_ available, this yields a net performance improvement. Otherwise, there is a very slight, negligible regression, which I believe is perfectly reasonable for a processor that does not support SSE4.2.

When not compiling for x86, the behavior is exactly the same as before, so there are no regressions.

If the binary is compiled for x86 with SSE4.2 enabled, we still do a runtime check, but this has negligible impact; furthermore, the point of the PR is that this is rarely the case.

The benchmark results are available at [llvm-compile-time-tracker](https://llvm-compile-time-tracker.com/compare.php?from=f88d060c4176d17df56587a083944637ca865cb3&to=d5485438edd460892bf210916827e0d92fc24065&stat=instructions%3Au).
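The dispatch scheme described in the commit message can be sketched roughly as follows. This is a minimal illustration and not the code from the commit: the names `scanIdentifier`, `scanScalar`, `scanSSE42` and the `HAS_X86_DISPATCH` macro are hypothetical, and the sketch assumes GCC or Clang, since it relies on `__attribute__((target(...)))` and `__builtin_cpu_supports`.

```cpp
#include <cassert>
#include <cstddef>

#if defined(__x86_64__) || defined(__i386__)
#include <nmmintrin.h> // SSE4.2 intrinsics (_mm_cmpistri)
#define HAS_X86_DISPATCH 1 // hypothetical macro, for illustration only
#else
#define HAS_X86_DISPATCH 0
#endif

// ASCII identifier characters: [_A-Za-z0-9].
static bool isIdentChar(unsigned char C) {
  return (C >= 'a' && C <= 'z') || (C >= 'A' && C <= 'Z') ||
         (C >= '0' && C <= '9') || C == '_';
}

// Scalar loop: always compiled, on every target.
static const char *scanScalar(const char *P, const char *End) {
  while (P != End && isIdentChar(static_cast<unsigned char>(*P)))
    ++P;
  return P;
}

#if HAS_X86_DISPATCH
// SSE4.2 fast path. The target attribute enables SSE4.2 code generation for
// this one function even when the translation unit is built without -msse4.2.
__attribute__((target("sse4.2")))
static const char *scanSSE42(const char *P, const char *End) {
  // Pairs of inclusive byte ranges; the zero padding ends the range list.
  alignas(16) static const char Ranges[16] = {'_', '_', 'A', 'Z',
                                              'a', 'z', '0', '9'};
  const __m128i RangeV = _mm_load_si128((const __m128i *)Ranges);
  while (End - P >= 16) {
    const __m128i Chunk = _mm_loadu_si128((const __m128i *)P);
    // Index of the first byte outside all ranges, or 16 if there is none.
    const int Consumed = _mm_cmpistri(
        RangeV, Chunk,
        _SIDD_UBYTE_OPS | _SIDD_CMP_RANGES | _SIDD_NEGATIVE_POLARITY |
            _SIDD_LEAST_SIGNIFICANT);
    P += Consumed;
    if (Consumed != 16)
      return P;
  }
  // Fewer than 16 bytes left: finish with the scalar loop.
  return scanScalar(P, End);
}
#endif

// Entry point: checks for SSE4.2 at runtime on x86, scalar-only elsewhere.
const char *scanIdentifier(const char *P, const char *End) {
  assert(P <= End && "invalid range");
#if HAS_X86_DISPATCH
  if (__builtin_cpu_supports("sse4.2"))
    return scanSSE42(P, End);
#endif
  return scanScalar(P, End);
}
```

Calling `scanIdentifier(Buf, Buf + Len)` returns a pointer to the first byte that is not an ASCII identifier character. On non-x86 targets the preprocessor removes both the SSE function and the runtime check, so only the scalar loop remains.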