llvm-project
b2bf017a - [X86] X86FixupInstTuning - prefer VPBLENDD to VPBLENDW shuffles on AVX2+ targets (#144269)

Commit

200 days ago

[X86] X86FixupInstTuning - prefer VPBLENDD to VPBLENDW shuffles on AVX2+ targets (#144269) On many Intel AVX2 targets (Haswell+), VPBLENDD has notably better throughput than VPBLENDW - and the remaining Intel/AMD targets have no preference. This patch replaces VPBLENDW shuffles if the shuffle mask can be safely widened from vXi16 to vXi32 and that the scheduler model doesn't consider it a regression (I haven't found any target where this is true, but we should retain the model check). Noticed while working on #142972 where VMOVSS nodes were regressing to VPBLENDW nodes during domain switching.

References

#144269 - [X86] X86FixupInstTuning - prefer VPBLENDD to VPBLENDW shuffles on AVX2+ targets

Author

RKSimon

Parents

eddab9b7

llvm-project b2bf017a - [X86] X86FixupInstTuning - prefer VPBLENDD to VPBLENDW shuffles on AVX2+ targets (#144269)

llvm-project
b2bf017a - [X86] X86FixupInstTuning - prefer VPBLENDD to VPBLENDW shuffles on AVX2+ targets (#144269)