jax
e05385d9 - [Pallas][Mosaic GPU] Rename for_tensor_core to orders_tensor_core

Commit
185 days ago
[Pallas][Mosaic GPU] Rename for_tensor_core to orders_tensor_core

This name better reflects the difference in barrier semantics this flag causes. Unless set, nothing should be assumed about the relative ordering of tcgen05 ops and barriers. In particular, even if you await the completion of a tcgen05 op (e.g. a load) in one thread and signal another, when it completes its wait, you can't assume that the load really has been performed in its entirety unless the barrier you've used to synchronize those two threads has orders_tensor_core=True.

To me this is a big usability issue in the PTX design. It's unsafe by default, and requires us to insert additional fences to indicate which synchronization primitives interact with the TensorCore.

PiperOrigin-RevId: 778030790
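Since the change is a pure rename of a keyword, downstream call sites would need the same substitution. The following is a minimal sketch of how that could be scripted, not part of the commit itself: the helper name `rename_flag` is hypothetical, and the sample call site (`plgpu.Barrier(num_arrivals=1, ...)`) is only an assumed illustration of where the flag might appear.

```python
import re

# Old and new keyword names, taken from the commit title.
OLD_NAME = "for_tensor_core"
NEW_NAME = "orders_tensor_core"

# \b keeps longer identifiers that merely contain the old name untouched,
# because "_" counts as a word character in Python regular expressions.
_PATTERN = re.compile(rf"\b{re.escape(OLD_NAME)}\b")


def rename_flag(source: str) -> str:
    """Return `source` with every standalone use of the old flag renamed.

    Hypothetical migration helper; it rewrites text only and does not
    parse Python, so comments and strings are rewritten as well.
    """
    return _PATTERN.sub(NEW_NAME, source)


if __name__ == "__main__":
    # Assumed example of a call site that still uses the old keyword.
    before = "barrier = plgpu.Barrier(num_arrivals=1, for_tensor_core=True)"
    print(rename_flag(before))
```

The word-boundary match is a deliberate choice: a plain string replace would also rewrite unrelated identifiers that happen to contain the old name as a substring.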