llvm-project
2fc07338 - [AArch64] Decompose FADD reductions with known zero elements (#167313)

3 days ago
FADDV is matched into FADDPv4f32 + FADDPv2i32p, but this can be relaxed when one or more elements (usually the 4th) are known to be zero.

Before:
```
movi d1, #0000000000000000
mov v0.s[3], v1.s[0]
faddp v0.4s, v0.4s, v0.4s
faddp s0, v0.2s
```

After:
```
mov s1, v0.s[2]
faddp s0, v0.2s
fadd s0, s0, s1
```

When all of the elements are zero, the intrinsic is now folded to a constant instead of emitting two additions.
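For readers who want to check the lane arithmetic, here is a minimal scalar sketch in C++ of the two sequences for the case where lane 3 is known to be zero. It is an illustration only, not code from the patch: the function names `pairwise_reduce`, `decomposed_reduce`, and `check_example` are hypothetical, and the sample values are chosen so that every addition is exact. The sketch makes no claim about signed-zero or NaN corner cases.

```cpp
#include <array>
#include <cassert>

// Hypothetical scalar model of the "before" sequence:
// faddp v0.4s, v0.4s, v0.4s followed by faddp s0, v0.2s,
// which computes (a0 + a1) + (a2 + a3).
float pairwise_reduce(const std::array<float, 4>& v) {
    float lo = v[0] + v[1];
    float hi = v[2] + v[3];
    return lo + hi;
}

// Hypothetical scalar model of the "after" sequence, valid only
// under the assumption that lane 3 is known to be zero.
float decomposed_reduce(const std::array<float, 4>& v) {
    float lane2 = v[2];       // mov   s1, v0.s[2]
    float lo = v[0] + v[1];   // faddp s0, v0.2s
    return lo + lane2;        // fadd  s0, s0, s1
}

// Compares both forms on one input whose lane 3 is zero.
void check_example() {
    const std::array<float, 4> v{1.5f, 2.25f, -0.75f, 0.0f};
    assert(pairwise_reduce(v) == decomposed_reduce(v));
}
```

Calling `check_example()` from a `main` function exercises both forms on the same input. The decomposed form drops the vector-wide pairwise add and the zero-lane insert, which is where the saving in the "After" listing comes from.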