llvm-project
b4b819ce - [MLIR][NVVM] Add Op for TMA Store with reduction (#118853)

Commit
292 days ago
[MLIR][NVVM] Add Op for TMA Store with reduction (#118853)

PR #116854 adds intrinsics for TMA Store with reduction. This patch adds an NVVM Dialect Op for the same.

* Lit tests are added to verify the lowering to LLVM intrinsics and invalid cases.
* The common verifier method is updated to handle im2col modes without offsets. This helps Ops like TMA Store, TMA StoreReduce etc.
* The nvvmir.mlir test file is already large. So, this patch adds the tests for this Op in a new file under a separate "nvvm/" directory. [mlir/test/Target/LLVMIR/"nvvm"/tma_store_reduce.mlir]

PTX Spec reference: https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#data-movement-and-conversion-instructions-cp-reduce-async-bulk-tensor

Signed-off-by: Durgadoss R <durgadossr@nvidia.com>
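
The verifier change mentioned in the commit message (accepting im2col modes that carry no offsets) can be illustrated with a small standalone sketch. This is not the code from the patch: the function name `verify_tma_tensor_op`, its parameters, and the error strings are hypothetical. The limits it checks (1 to 5 tensor dimensions, im2col only for tensors of at least 3 dimensions, and `dims - 2` offsets when offsets are present) are assumptions based on the PTX rules for bulk tensor copies, not on the commit text.

```python
from typing import Optional


def verify_tma_tensor_op(
    tensor_dims: int, is_im2col: bool, num_im2col_offsets: int
) -> Optional[str]:
    """Return an error message, or None if the operand counts are consistent.

    Hypothetical stand-in for a shared verifier used by several TMA ops.
    """
    # Assumption: bulk tensor copies address 1-d to 5-d tensors.
    if not 1 <= tensor_dims <= 5:
        return "expects between 1 and 5 tensor coordinates"

    if is_im2col:
        # Assumption: im2col mode is only defined for 3-d tensors and above.
        if tensor_dims < 3:
            return "im2col mode needs at least a 3-d tensor"
        # Offsets are optional. Ops such as a store or a store-with-reduce
        # pass none, so zero is accepted; a non-zero count must be dims - 2.
        if num_im2col_offsets and tensor_dims != num_im2col_offsets + 2:
            return "im2col offsets must number (dims - 2)"

    return None
```

With this shape, a store-style op in im2col mode calls the check with zero offsets and passes, while a load-style op that supplies offsets is still held to the `dims - 2` count.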