pytorch
00997969 - [CUDA Pinned Memory] [Retry] Alternative implementation of pinned memory allocator focusing on multi-threaded scalability (#69299)

Commit

2 years ago

[CUDA Pinned Memory] [Retry] Alternative implementation of pinned memory allocator focusing on multi-threaded scalability (#69299)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69299

https://github.com/pytorch/pytorch/pull/68906 + https://github.com/pytorch/pytorch/pull/68749 plugged one correctness hole (non-blocking copies of offset pinned memory tensors) while introducing another (non-blocking copies of pinned memory tensors with a non-standard DataPtr context).

In this revision, we use both the tensor data pointer and the DataPtr context to attempt to identify the originating block in the pinned memory allocator.

Test Plan: New unit tests added to cover the previously missing case.

Reviewed By: yinghai

Differential Revision: D32787087

fbshipit-source-id: 0cb0d29d7c39a13f433eb1cd423dc0d2a303c955
(cherry picked from commit 297157b1a13b5c75d860cac9eba4fe7fe1ad5e6f)
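The two-key lookup described in the summary can be illustrated with a toy model. This is a minimal sketch, not the code from the commit: `ToyPinnedAllocator`, `Block`, `find_block`, and the integer "pointers" are all hypothetical names invented here, and the order of the two lookups (context first, then data pointer) is an assumption. It only shows why consulting both keys covers both failure cases named above: an offset tensor still carries the original context, while a tensor with a non-standard context may still start at the block's base address.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class Block:
    """One pinned host allocation tracked by the toy allocator."""
    base: int                      # base address of the allocation
    size: int                      # size in bytes
    streams: Set[int] = field(default_factory=set)  # streams that used it


class ToyPinnedAllocator:
    """Hypothetical model: blocks are findable by context OR by base pointer."""

    def __init__(self) -> None:
        self._by_context: Dict[int, Block] = {}
        self._by_base: Dict[int, Block] = {}
        self._next_ctx = 1

    def allocate(self, base: int, size: int) -> int:
        """Register a block and return the context handle attached to it."""
        block = Block(base, size)
        ctx = self._next_ctx
        self._next_ctx += 1
        self._by_context[ctx] = block
        self._by_base[base] = block
        return ctx

    def find_block(self, data_ptr: int, ctx: Optional[int]) -> Optional[Block]:
        """Try the context first, then fall back to the data pointer."""
        block = self._by_context.get(ctx)
        if block is not None:
            return block           # covers offset views (ptr != base)
        return self._by_base.get(data_ptr)  # covers non-standard contexts

    def record_stream(self, data_ptr: int, ctx: Optional[int], stream: int) -> bool:
        """Note that `stream` used the block; False if the block is unknown."""
        block = self.find_block(data_ptr, ctx)
        if block is None:
            return False
        block.streams.add(stream)
        return True


alloc = ToyPinnedAllocator()
ctx = alloc.allocate(base=0x1000, size=256)

# Offset view: the data pointer moved, but the context still identifies the block.
offset_hit = alloc.record_stream(data_ptr=0x1040, ctx=ctx, stream=7)

# Non-standard context: the context is foreign, but the pointer is the block base.
foreign_hit = alloc.record_stream(data_ptr=0x1000, ctx=9999, stream=8)

# Neither key matches: the memory did not come from this allocator.
miss = alloc.record_stream(data_ptr=0x2000, ctx=9999, stream=9)
```

In this model a tensor that is both offset and wrapped in a foreign context would still be missed, which is consistent with the commit's wording that the allocator "attempts" to identify the originating block.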

References

#72894 - Merge pytorch master into lazy_tensor_staging

Author

Andrew Tulloch

Committer

pytorchmergebot

Parents

ebeeee7b
