llvm
15ef1e0d - [CUDA] Use cuMemPrefetchAsync for Managed Memory in urEnqueueUSMMemcpy

Commit
68 days ago
[CUDA] Use cuMemPrefetchAsync for Managed Memory in urEnqueueUSMMemcpy

For CUDA Managed Memory (CU_MEMORYTYPE_UNIFIED), use prefetch hints instead of relying solely on automatic migration:

1. Prefetch destination to queue's device before copy
2. Perform cuMemcpyAsync
3. Subsequent kernel access on other device will trigger migration

Also properly handle Device memory cross-device with cuMemcpyPeerAsync.
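The dispatch described in the commit message can be sketched as a small decision function. This is not the code from the commit: it is a toolkit-free model that only records which CUDA driver call would be issued in which order. `MemType`, `Ptr`, and `planMemcpy` are hypothetical names introduced for illustration. The sketch also assumes that the prefetch is issued only when the destination is managed memory and that the peer copy is chosen only when both pointers are device memory on different devices.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the CUmemorytype values named in the commit
// message (CU_MEMORYTYPE_HOST, CU_MEMORYTYPE_DEVICE, CU_MEMORYTYPE_UNIFIED).
enum class MemType { Host, Device, Unified };

// Hypothetical description of a USM pointer: its memory type and, for
// device memory, the ordinal of the device that owns the allocation.
struct Ptr {
    MemType type;
    int device;
};

// Returns, in order, the names of the CUDA driver calls that the commit
// message describes for a copy enqueued on a queue bound to `queueDevice`.
// Assumption: this mirrors the described behaviour, not the actual source.
std::vector<std::string> planMemcpy(const Ptr& dst, const Ptr& src,
                                    int queueDevice) {
    std::vector<std::string> plan;

    // Device-to-device copy across two different devices: peer copy.
    if (dst.type == MemType::Device && src.type == MemType::Device &&
        dst.device != src.device) {
        plan.push_back("cuMemcpyPeerAsync");
        return plan;
    }

    // Managed destination: hint the driver to migrate the pages to the
    // queue's device before the copy instead of relying on page faults.
    if (dst.type == MemType::Unified) {
        (void)queueDevice;  // would be the prefetch target device
        plan.push_back("cuMemPrefetchAsync");
    }

    // All remaining cases fall through to the ordinary async copy.
    plan.push_back("cuMemcpyAsync");
    return plan;
}
```

Step 3 of the commit message (migration on later kernel access from another device) needs no call here, since the driver performs it on demand.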