jax
bb454d20 - [Mosaic GPU] Fix a race in test_remote_async_copy

Commit
205 days ago
[Mosaic GPU] Fix a race in test_remote_async_copy

Previously, the kernel in the test would exit without observing the completion of the current GPU's and the peer GPU's writes to each other's memory, which occasionally led to failing correctness checks.

We fix the race by adding a system memory barrier to observe the completion of this GPU's write to the peer GPU's memory, and a semaphore that allows us to observe the peer GPU's write to this GPU's memory.
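The handshake described in the commit message can be illustrated with a small CPU-only analogy. This is a sketch under stated assumptions, not the code from the test: it uses Python's standard `threading` module rather than any Mosaic GPU API, each thread stands in for one GPU's kernel, and the names `run_exchange`, `kernel`, `buffers`, and `received` are hypothetical. The GPU-side system memory barrier has no direct counterpart here (a Python semaphore release already orders the preceding write), so that step appears only as a comment.

```python
import threading

NUM_DEVICES = 2  # assumption: two peers, each writing to the other


def run_exchange():
    # One "local memory" slot per simulated device.
    buffers = [None] * NUM_DEVICES
    # received[i] is signaled by the peer once its write into buffers[i] is done.
    received = [threading.Semaphore(0) for _ in range(NUM_DEVICES)]
    results = [None] * NUM_DEVICES

    def kernel(dev_id):
        peer = (dev_id + 1) % NUM_DEVICES
        # Stand-in for the remote copy: write into the peer's memory.
        buffers[peer] = f"payload from {dev_id}"
        # On a GPU, a system-scope memory barrier would go here so that the
        # write above is complete before the peer is notified.
        received[peer].release()
        # Do not exit until the peer's write into our own memory has landed.
        received[dev_id].acquire()
        results[dev_id] = buffers[dev_id]

    threads = [
        threading.Thread(target=kernel, args=(i,)) for i in range(NUM_DEVICES)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(run_exchange())
```

Without the `acquire()` call, a simulated kernel could read its own slot before the peer has written it, which mirrors the intermittent correctness failures the commit describes.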