llvm-project
ea43a308 - [AMDGPU] Vectorize more 16 bit shuffles (#90648)

Commit
1 year ago
[AMDGPU] Vectorize more 16 bit shuffles (#90648)

In the case of larger vectors, we should still prefer the vectorized version (i.e. shufflevector vs extract/insert chains).

In arithmetic chains, vectorization results in chains of packed math instructions (as opposed to unpack/repack & scalarized arithmetic): https://godbolt.org/z/c5onaf6G5

In chains with PHIs, vectorization again removes the unnecessary pack / repack code around BBs: https://godbolt.org/z/vz7zYzvhs
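
To make the "shufflevector vs extract/insert chains" distinction concrete, here is a minimal C++ sketch that models both forms on a four-lane vector of 16-bit values. It is an illustration only and not code from this commit: the type `V4`, the functions `extract_insert_chain` and `shuffle_vector`, and the mask `<0, 5, 2, 7>` are hypothetical names and values chosen for the example. The mask indexing (indices 0 to 3 pick from the first operand, 4 to 7 from the second) follows the LLVM IR `shufflevector` semantics.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for an LLVM <4 x i16> value.
using V4 = std::array<std::int16_t, 4>;

// Scalarized form: one extractelement followed by one insertelement per lane.
// This models the extract/insert chain for the mask <0, 5, 2, 7>.
V4 extract_insert_chain(const V4& a, const V4& b) {
    V4 r{};
    const std::int16_t e0 = a[0];  // extractelement a, 0
    r[0] = e0;                     // insertelement r, e0, 0
    const std::int16_t e1 = b[1];  // extractelement b, 1
    r[1] = e1;                     // insertelement r, e1, 1
    const std::int16_t e2 = a[2];  // extractelement a, 2
    r[2] = e2;                     // insertelement r, e2, 2
    const std::int16_t e3 = b[3];  // extractelement b, 3
    r[3] = e3;                     // insertelement r, e3, 3
    return r;
}

// Vectorized form: a single operation that selects lanes from the
// concatenation of a and b according to a mask.
V4 shuffle_vector(const V4& a, const V4& b, const std::array<int, 4>& mask) {
    V4 r{};
    for (std::size_t lane = 0; lane < r.size(); ++lane) {
        const int idx = mask[lane];
        // Indices 0..3 read from a, indices 4..7 read from b.
        r[lane] = idx < 4 ? a[static_cast<std::size_t>(idx)]
                          : b[static_cast<std::size_t>(idx - 4)];
    }
    return r;
}
```

Both functions compute the same result for the mask `<0, 5, 2, 7>`. The difference the commit message points at is in how the value is expressed: the chain form is eight separate scalar operations, while the shuffle form is one vector operation that can stay in packed form alongside neighboring packed instructions.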