llvm-project
88bd5659 - VectorCombine: Improve the insert/extract fold in the narrowing case (#168820)

Commit

2 days ago

VectorCombine: Improve the insert/extract fold in the narrowing case (#168820)

Keeping the extracted element in a natural position in the narrowed vector has two beneficial effects:

1. It makes the narrowing shuffles cheaper (at least on AMDGPU), which allows the insert/extract fold to trigger.

2. It makes the narrowing shuffles in a chain of extract/insert compatible, which allows foldLengthChangingShuffles to successfully recognize a chain that can be folded.

There are minor X86 test changes that look reasonable to me. The IR change for AVX2 in llvm/test/Transforms/VectorCombine/X86/extract-insert-poison.ll doesn't change the assembly generated by `llc -mtriple=x86_64-- -mattr=AVX2` at all.
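The second effect can be illustrated with a small standalone sketch. It is not the VectorCombine code: `narrowingMask` and `mergeMasks` are hypothetical helpers, `-1` stands in for a poison mask lane, and reading "natural position" as lane `ExtIdx % DstLen` is an assumption made for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of LLVM. Builds a shuffle mask that narrows a
// wider source vector to DstLen lanes while carrying over source element
// ExtIdx. A value of -1 marks a poison (don't-care) lane.
std::vector<int> narrowingMask(int ExtIdx, int DstLen) {
  assert(ExtIdx >= 0 && DstLen > 0);
  std::vector<int> Mask(DstLen, -1);
  // Assumption: the "natural position" is the lane the element would occupy
  // if the source were split into DstLen-wide chunks.
  Mask[ExtIdx % DstLen] = ExtIdx;
  return Mask;
}

// Hypothetical helper. Two masks over the same source are compatible when no
// lane is claimed by two different source elements. On success, Out holds the
// combined mask.
bool mergeMasks(const std::vector<int> &A, const std::vector<int> &B,
                std::vector<int> &Out) {
  if (A.size() != B.size())
    return false;
  Out.assign(A.size(), -1);
  for (std::size_t I = 0; I < A.size(); ++I) {
    if (A[I] != -1 && B[I] != -1 && A[I] != B[I])
      return false; // Both masks want this lane for different elements.
    Out[I] = (A[I] != -1) ? A[I] : B[I];
  }
  return true;
}
```

Under these assumptions, narrowing an 8-lane source to 4 lanes for elements 5 and 6 gives masks `{-1, 5, -1, -1}` and `{-1, -1, 6, -1}`, which merge into `{-1, 5, 6, -1}`. If both elements were instead placed in lane 0, the two masks would conflict and could not be combined into a single shuffle.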

References

#168820 - VectorCombine: Improve the insert/extract fold in the narrowing case

Author

nhaehnle

Parents

bb1bfb1c
