llvm
e5a40686 - Use cuMemcpyPeerAsync for cross-device USM copies in CUDA adapter

Commit
25 days ago
The CUDA adapter was using cuMemcpyAsync() for all USM memory copies, including cross-device copies. However, CUDA requires cuMemcpyPeerAsync() for peer-to-peer copies between different devices, even when P2P access is enabled via cuCtxEnablePeerAccess().

This change:
- Detects cross-device copies by querying CU_POINTER_ATTRIBUTE_CONTEXT for both source and destination pointers
- Uses cuMemcpyPeerAsync() when contexts differ (cross-device copy)
- Falls back to cuMemcpyAsync() for same-device or host-device copies

This fixes the urEnqueueKernelLaunchIncrementMultiDeviceTest, which chains kernel launches and cross-device memcpy operations.

Fixes: #19033
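
The dispatch rule described in this commit message can be sketched in plain C++ as follows. This is an illustration only, not the adapter's actual code: `ContextHandle`, `CopyPath`, and `selectCopyPath` are hypothetical names, and `ContextHandle` stands in for the `CUcontext` that a `CU_POINTER_ATTRIBUTE_CONTEXT` query would return. The sketch also assumes that a pointer with no owning CUDA context (for example, plain host memory for which the query fails) is represented as `nullptr`.

```cpp
// Hypothetical stand-in for CUcontext. nullptr models "no owning CUDA
// context", e.g. a host pointer for which the attribute query did not succeed.
using ContextHandle = const void *;

// Which driver entry point the copy would be routed to (hypothetical enum).
enum class CopyPath {
  SameContextAsync, // would map to cuMemcpyAsync()
  PeerAsync         // would map to cuMemcpyPeerAsync()
};

// Pick the peer path only when both pointers belong to a CUDA context and
// those contexts differ; otherwise fall back to the ordinary async copy,
// which covers same-device and host-device transfers.
inline CopyPath selectCopyPath(ContextHandle srcCtx, ContextHandle dstCtx) {
  const bool bothHaveContext = (srcCtx != nullptr) && (dstCtx != nullptr);
  if (bothHaveContext && srcCtx != dstCtx) {
    return CopyPath::PeerAsync;
  }
  return CopyPath::SameContextAsync;
}
```

In a real adapter the two handles would come from the driver rather than being passed in by the caller, and the `PeerAsync` branch would hand both contexts to cuMemcpyPeerAsync() along with the pointers, size, and stream.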