llvm-project
24838316 - [MLIR][NVVM] Extend TMA Bulk Copy Op (#140232)

Commit
309 days ago
[MLIR][NVVM] Extend TMA Bulk Copy Op (#140232)

This patch extends the non-tensor TMA Bulk Copy Op (from shared_cta to global) with an optional byte mask operand. This mask helps selectively copy a particular byte to the destination.

* lit tests are added to verify the lowering to the intrinsics.

Signed-off-by: Durgadoss R <durgadossr@nvidia.com>
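To illustrate what a byte mask on a bulk copy means, here is a minimal host-side sketch in Python. It is not the Op's lowering or the intrinsic itself. The function name `masked_bulk_copy` and the `chunk` parameter are hypothetical, and the 16-byte chunk granularity (bit i of a 16-bit mask gating byte i of every 16-byte chunk) is an assumption that the commit message does not state.

```python
def masked_bulk_copy(src: bytes, dst: bytearray, byte_mask: int, chunk: int = 16) -> None:
    """Copy src into dst, writing only the bytes selected by byte_mask.

    Assumption (not from the commit message): bit i of byte_mask controls
    byte i of each `chunk`-byte group of the source.
    """
    if len(src) != len(dst):
        raise ValueError("source and destination must have the same size")
    for offset, value in enumerate(src):
        # Position of this byte inside its chunk selects the mask bit.
        if (byte_mask >> (offset % chunk)) & 1:
            dst[offset] = value


# Usage: copy only bytes 0 and 1 of every 16-byte chunk.
source = bytes(range(32))
destination = bytearray(b"\xff" * 32)
masked_bulk_copy(source, destination, byte_mask=0b11)
```

Bytes whose mask bit is clear keep their previous value in the destination, which is the sense in which the copy is selective.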