llvm
7fe41dd9 - Simplify cross-device sync: use only cuEventSynchronize

Commit
22 days ago
Simplify cross-device sync: use only cuEventSynchronize

Previous approach with barrier events was corrupting CUDA state. cuEventRecord is asynchronous, so destroying the event immediately after recording caused undefined behavior.

Now use simple host synchronization:
- cuEventSynchronize blocks CPU until event completes
- Subsequent enqueues to target stream happen after event completion
- No need for barrier events or additional synchronization