jax
3f7d9106 - [Mosaic GPU] Fix predicate for `tcgen05_mma` lowering.

Commit

82 days ago

[Mosaic GPU] Fix predicate for `tcgen05_mma` lowering.

From https://docs.nvidia.com/cuda/parallel-thread-execution/#tcgen05-mma-instructions-mma:

> The instruction `tcgen05.mma` has single thread semantics, unlike the collective instructions `mma.sync` or `wgmma.mma_async`. So, a single thread issuing the `tcgen05.mma` will result in the initiation of the whole matrix multiply and accumulate operation.

This is consistent with LANE lowering semantics.

PiperOrigin-RevId: 880831793
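The quoted PTX behaviour can be illustrated with a toy model. This is only a sketch in plain Python, not Mosaic GPU code: `count_issues`, `unpredicated`, and `first_lane_of_warp` are hypothetical names, and the commit message does not say which thread the actual lowering elects. The model only shows why an instruction with single-thread semantics needs a predicate that is true for exactly one thread among those executing it.

```python
WARP_SIZE = 32  # threads per warp on NVIDIA GPUs


def count_issues(num_threads, predicate):
    """Count the threads that would issue the instruction.

    With single-thread semantics, each issuing thread starts a whole
    matrix multiply-accumulate, so this is also the number of MMAs started.
    """
    return sum(1 for tid in range(num_threads) if predicate(tid))


def unpredicated(tid):
    # Every thread issues the instruction.
    return True


def first_lane_of_warp(tid):
    # Hypothetical election rule (an assumption): lane 0 of each warp issues.
    return tid % WARP_SIZE == 0


# One warp executing the same code path.
issues_unpredicated = count_issues(WARP_SIZE, unpredicated)
issues_predicated = count_issues(WARP_SIZE, first_lane_of_warp)

print(issues_unpredicated, issues_predicated)
```

In this model, a warp that runs the instruction unpredicated starts one MMA per thread, while the predicated version starts exactly one.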

References

#35719 - [Mosaic GPU] Fix predicate for `tcgen05_mma` lowering.

Author

allanrenucci

Committer

Google-ML-Automation

Parents

9334db25
