llvm-project
0928f46c - [MLIR][GPU] Ensure all lanes in cluster have final reduction value (#165764)

Commit
4 days ago
This fixes clustered reductions with a cluster size of 32 when the subgroup size is 64. Previously, only lanes [16, 32) ∪ [48, 64) held the correct clusterwise reduction value. This PR adds a swizzle instruction that broadcasts the correct value down to lanes [0, 16) ∪ [32, 48), so every lane in each cluster ends up with the final reduction value.
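To make the lane ranges concrete, here is a small lane-level simulation in Python. It is only a toy model under stated assumptions, not the actual lowering: the 16-lane row width, the helper names (`reduce_rows`, `combine_into_upper_rows`, `broadcast_down`, `correct_lanes`), and the way the partial result is formed are all hypothetical and chosen so that the pre-fix state matches the description above.

```python
SUBGROUP_SIZE = 64
CLUSTER_SIZE = 32
ROW = 16  # assumed row width; not taken from the commit


def reduce_rows(values):
    """Give every lane the sum of its own 16-lane row."""
    out = []
    for lane in range(len(values)):
        start = (lane // ROW) * ROW
        out.append(sum(values[start:start + ROW]))
    return out


def combine_into_upper_rows(row_sums):
    """Hypothetical pre-fix step: only the upper row of each 32-lane
    cluster adds in the lower row's sum, so only it holds the full value."""
    out = list(row_sums)
    for lane in range(len(row_sums)):
        if lane % CLUSTER_SIZE >= ROW:
            out[lane] = row_sums[lane] + row_sums[lane - ROW]
    return out


def broadcast_down(partial):
    """Model of the fix: lower-row lanes copy the value held by the
    matching lane in the upper row of the same cluster."""
    out = list(partial)
    for lane in range(len(partial)):
        if lane % CLUSTER_SIZE < ROW:
            out[lane] = partial[lane + ROW]
    return out


def correct_lanes(result, values):
    """Return the lanes whose value equals the full sum of their cluster."""
    good = []
    for lane, got in enumerate(result):
        start = (lane // CLUSTER_SIZE) * CLUSTER_SIZE
        if got == sum(values[start:start + CLUSTER_SIZE]):
            good.append(lane)
    return good


values = list(range(SUBGROUP_SIZE))
before = combine_into_upper_rows(reduce_rows(values))
after = broadcast_down(before)

print("correct lanes before the broadcast:", correct_lanes(before, values))
print("correct lanes after the broadcast: ", correct_lanes(after, values))
```

In this model, the lanes reported as correct before the broadcast are exactly [16, 32) ∪ [48, 64), and all 64 lanes are correct afterwards.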