
Commit

3 days ago

[AArch64] Optimize vector fmul(sitofp/uitofp, 1/2^N) -> scvtf/ucvtf (#141480)

When a vector integer-to-float conversion is followed by a multiply with a reciprocal power-of-two constant, we can fold both operations into a single SCVTF or UCVTF instruction with a fixed-point shift operand.

For example, `fmul(sitofp(v2i32 x), <0.5, 0.5>)` becomes `scvtf.2s v0, v0, #1`.

This is a reworked version with several improvements over the original submission:

- Rewrite the C++ operand matcher to share implementation with the existing `SelectCVTFixedPointVec` (MOVIshift, FMOV, and DUP handling with correct truncation for f16)
- Add `uitofp`/`ucvtf` patterns via a `CVTFRecipPat` multiclass
- Add full GlobalISel support (`GIComplexOperandMatcher` + renderer)

Supported vector types: `v2f32`, `v4f32`, `v2f64`, `v4f16`, `v8f16`.

Fixes #94909
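The arithmetic behind the fold can be illustrated with a small Python sketch. It is not the LLVM matcher: the function names `recip_pow2_shift` and `fixed_point_convert` are hypothetical, the 1..32 shift range is an assumption chosen for 32-bit lanes, and Python doubles stand in for the vector element types. The sketch checks whether a constant is exactly 1/2^N and shows that converting an integer and then scaling by that constant gives the same value as treating the integer as a fixed-point number with N fractional bits, which is what the `#N` operand of `scvtf`/`ucvtf` expresses.

```python
import math


def recip_pow2_shift(c: float, max_bits: int = 32):
    """Return N if c == 1 / 2**N with 1 <= N <= max_bits, else None.

    max_bits=32 is an assumption of this sketch (32-bit lanes).
    """
    # Reject NaN, zero, negatives, and infinity up front.
    if not (c > 0.0) or math.isinf(c):
        return None
    # frexp gives c == mantissa * 2**exponent with 0.5 <= mantissa < 1.
    mantissa, exponent = math.frexp(c)
    if mantissa != 0.5:
        return None  # not an exact power of two
    n = 1 - exponent  # c == 2**(exponent - 1) == 2**(-n)
    return n if 1 <= n <= max_bits else None


def convert_then_scale(x: int, c: float) -> float:
    """The unfused form: int-to-float conversion followed by a multiply."""
    return float(x) * c


def fixed_point_convert(x: int, n: int) -> float:
    """The fused form: interpret x as fixed-point with n fractional bits."""
    return math.ldexp(float(x), -n)


if __name__ == "__main__":
    for c in (0.5, 0.25, 0.125, 1.0, 0.3):
        n = recip_pow2_shift(c)
        if n is None:
            print(f"{c}: not foldable in this sketch")
            continue
        same = all(
            convert_then_scale(x, c) == fixed_point_convert(x, n)
            for x in range(-1000, 1001)
        )
        print(f"{c}: shift #{n}, results match: {same}")
```

In this sketch a constant of 1.0 is rejected because it would correspond to a shift of 0, which falls outside the assumed range, and 0.3 is rejected because it is not an exact power of two.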

References

#141480 - [AArch64] Optimize vector fmul(sitofp/uitofp, 1/2^N) -> scvtf/ucvtf

Author

jph-13

Parents

7b43dcd7

llvm-project fbac55b8 - [AArch64] Optimize vector fmul(sitofp/uitofp, 1/2^N) -> scvtf/ucvtf (#141480)
