PR #6398: Call cuStreamSynchronize if cudaMallocAsync allocator fails on allocation
Imported from GitHub PR https://github.com/openxla/xla/pull/6398
Call cuStreamSynchronize if the cudaMallocAsync allocator fails to allocate. This can reduce the chance of OOM.
Copybara import of the project:
--
2a03090982017395251572fd2f0e5adca2a902f9 by Shawn Wang <shawnw@nvidia.com>:
The sync gives the driver more options for finding memory, so an allocation that failed can sometimes succeed after a sync.
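A minimal sketch of the retry pattern described here, not the code from this change. All names (AllocateWithSyncRetry, DevicePtr, AllocFn, SyncFn) are hypothetical. The two callbacks are assumptions that stand in for a stream-ordered allocation call (for example cuMemAllocFromPoolAsync) and for cuStreamSynchronize, so the sketch compiles without the CUDA toolkit.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <optional>

// Hypothetical stand-in for CUdeviceptr so the sketch needs no CUDA headers.
using DevicePtr = std::uint64_t;

// Assumed callback shapes:
//   AllocFn returns a pointer on success, std::nullopt on failure.
//   SyncFn returns true if the stream synchronized successfully.
using AllocFn = std::function<std::optional<DevicePtr>(std::size_t)>;
using SyncFn = std::function<bool()>;

// Try to allocate; on failure, synchronize the stream once and try again.
// Only a single retry is made, so a persistent OOM is still reported.
inline std::optional<DevicePtr> AllocateWithSyncRetry(
    const AllocFn& try_alloc, const SyncFn& sync_stream, std::size_t bytes) {
  if (auto ptr = try_alloc(bytes)) {
    return ptr;  // First attempt succeeded, no sync needed.
  }
  if (!sync_stream()) {
    return std::nullopt;  // The sync itself failed, so give up.
  }
  return try_alloc(bytes);  // Second and final attempt after the sync.
}
```

In a real allocator the callbacks would wrap the driver calls and translate their CUresult codes. The sync runs only on the failure path, so the common successful allocation stays asynchronous.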
Merging this change closes #6398
COPYBARA_INTEGRATE_REVIEW=https://github.com/openxla/xla/pull/6398 from shawnwang18:shawnw/cudamallocasync_synchronization 2a03090982017395251572fd2f0e5adca2a902f9
PiperOrigin-RevId: 575197415