llvm
bd224ebf - [CUDA] Use different strategies for Managed vs Device memory cross-device copies

Commit
33 days ago
For cross-device USM memcpy operations:
- Check memory type using CU_POINTER_ATTRIBUTE_MEMORY_TYPE
- For Managed Memory (CU_MEMORYTYPE_UNIFIED/USM Shared): use cuMemcpyAsync and let CUDA runtime handle page migration automatically
- For Device Memory (CU_MEMORYTYPE_DEVICE/USM Device): use cuMemcpyPeerAsync with explicit source and destination contexts

This approach leverages CUDA's Unified Memory subsystem for Managed Memory while using explicit peer copies for Device Memory.
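
The dispatch described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: `MemoryType`, `CopyStrategy`, and `chooseCrossDeviceCopy` are hypothetical names, and the enum is a stand-in that mirrors the numeric values of the CUDA driver's `CUmemorytype` so the sketch compiles without the CUDA toolkit. The message does not say what happens for other memory types, so the sketch leaves that case unspecified.

```cpp
#include <cstdint>

// Hypothetical stand-in for the CUDA driver's CUmemorytype enum
// (CU_MEMORYTYPE_HOST = 1, _DEVICE = 2, _ARRAY = 3, _UNIFIED = 4).
enum class MemoryType : std::uint32_t {
  Host = 1,
  Device = 2,
  Array = 3,
  Unified = 4,
};

// Hypothetical label for the driver call a cross-device copy would issue.
enum class CopyStrategy {
  MemcpyAsync,      // cuMemcpyAsync: the runtime handles page migration
  MemcpyPeerAsync,  // cuMemcpyPeerAsync: explicit source/destination contexts
  Unspecified,      // not covered by the commit message
};

// Map the queried memory type to the copy strategy named in the commit.
constexpr CopyStrategy chooseCrossDeviceCopy(MemoryType type) {
  switch (type) {
  case MemoryType::Unified:  // Managed Memory / USM Shared
    return CopyStrategy::MemcpyAsync;
  case MemoryType::Device:   // Device Memory / USM Device
    return CopyStrategy::MemcpyPeerAsync;
  default:                   // assumption: left to the caller to decide
    return CopyStrategy::Unspecified;
  }
}
```

In real code the memory type would presumably come from `cuPointerGetAttribute` with `CU_POINTER_ATTRIBUTE_MEMORY_TYPE`. Which pointer (source or destination) is queried is not stated in the message, so the sketch takes a single type as input.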