llvm
2705b67a - [UR][L0v2] Fix sync bug in enqueueEventsWaitWithBarrier (#21251)

Commit
26 days ago
`ur_queue_immediate_out_of_order_t::enqueueEventsWaitWithBarrier` has a bug where it waits for the barrier events `N` times on the first (internal) command list, instead of waiting once on each of the `N` command lists. This is likely a copy-paste error from the preceding call to `barrierFn` that was not caught in testing or code review.

The bug does not seem to reproduce on any released GPUs on Linux. It looks as if waiting for any event on a single command list blocks dispatch from every other command list on all our current GPUs. However, I did not investigate this deeply, because I believe this is a clear error on UR's side either way.

The bug IS reproducible on an Intel internal simulator; this is how I caught it. I can provide more details on internal channels if desired.

For reference, the reproducer used is below. It was tested on BMG and Panther Lake, where it passes both before and after the PR, and on the simulator, where it fails before this change and passes after it. The reproducer also passes with the Level Zero V1 adapter on the simulated device.

<details>
<summary>Reproducer</summary>

```cpp
#include <sycl/sycl.hpp>

#include <cstdlib>
#include <iostream>

int main(int argc, char *argv[]) {
  sycl::queue q; // Out of order!

  int tripCount = 200'000'000;
  if (argc > 1)
    tripCount = std::atoi(argv[1]);

  int *a = sycl::malloc_shared<int>(1, q);
  int *b = sycl::malloc_shared<int>(1, q);

  // Long-running kernel, so the second kernel can overtake it
  // if the barrier is not honored.
  q.single_task([=] {
    float sum = 0;
    for (int i = 0; i < tripCount; ++i)
      sum += sycl::sqrt(float(i));
    *a = (sum > 0);
  });

  q.ext_oneapi_submit_barrier();

  // Must observe the value of *a written by the first kernel.
  q.single_task([=] { *b = *a + 1; });

  q.wait();

  std::cout << "a: " << *a << ", b: " << *b << std::endl;
  if (*a != 1 || *b != 2) {
    std::cout << "Test failed!" << std::endl;
    return 1;
  }
  std::cout << "Test passed!" << std::endl;
}
```

</details>

I am unsure how a reasonable test might be written to cover this; please advise if that's desired.

Fixes: https://github.com/intel/llvm/issues/20861