llvm-project
108380da - [mlir][nvvm] Add `cp.async.bulk.tensor.shared.cluster.global.multicast` (#72429)

Commit

1 year ago

[mlir][nvvm] Add `cp.async.bulk.tensor.shared.cluster.global.multicast` (#72429)

This PR introduces the `cp.async.bulk.tensor.shared.cluster.global.multicast` Op in the NVVM dialect. It uses TMA to load data from global memory into the shared memory of multiple CTAs in the cluster. It resolves #72368.
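A minimal sketch of how the set of destination CTAs for a multicast load could be encoded. It assumes the destinations are selected by a 16-bit mask in which bit `i` stands for the CTA with rank `i` in the cluster, as in the PTX `.multicast::cluster` form. The helper name `multicast_mask` and its parameters are hypothetical and are not part of this commit or of the NVVM dialect.

```python
def multicast_mask(cta_ranks, cluster_size=16):
    """Return a 16-bit mask with bit i set for every destination CTA rank i.

    Hypothetical helper for illustration only; assumes a 16-bit CTA mask.
    """
    if not 1 <= cluster_size <= 16:
        raise ValueError("cluster_size must be between 1 and 16")
    mask = 0
    for rank in cta_ranks:
        if not 0 <= rank < cluster_size:
            raise ValueError(f"CTA rank {rank} is outside the cluster")
        mask |= 1 << rank  # select this CTA as a multicast destination
    return mask


if __name__ == "__main__":
    # Multicast to the first four CTAs of the cluster.
    print(hex(multicast_mask([0, 1, 2, 3], cluster_size=4)))
```

Under that assumption, a single load issued with such a mask would deliver the same tile to the shared memory of every selected CTA, rather than each CTA issuing its own load.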

References

#72429 - [mlir][nvvm] Improve `cp.async.bulk.tensor.shared.cluster.global` for multicast

Author

grypp

Parents

25d0f9fc
