llvm-project
5efce739 - [compiler-rt][ARM] Optimized mulsf3 and divsf3 (#168394)

Commit
20 days ago
[compiler-rt][ARM] Optimized mulsf3 and divsf3 (#168394) (Reland of #161546, fixing three build and test issues) This commit adds optimized assembly versions of single-precision float multiplication and division. Both functions are implemented in a style that can be assembled as either of Arm and Thumb2; for multiplication, a separate implementation is provided for Thumb1. Also, extensive new tests are added for multiplication and division. These implementations can be removed from the build by defining the cmake variable COMPILER_RT_ARM_OPTIMIZED_FP=OFF. Outlying parts of the functionality which are not on the fast path, such as NaN handling and underflow, are handled in helper functions written in C. These can be shared between the Arm/Thumb2 and Thumb1 implementations, and also reused by other optimized assembly functions we hope to add in future.
Author
Parents
Loading