Allow specifying a set of devices for CUDAFuture (#56515)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56515
In https://github.com/pytorch/pytorch/pull/56405 we finally found a solution for supporting RPC remote user functions that create/use CUDA tensors on devices not used by their arguments: we define a "bounding set" of devices when constructing the agent and allow all functions to freely use any of those devices.
We had the exact same problem with the callbacks of CUDAFuture, and in this PR I adopt the exact same solution: I allow specifying a set of devices when constructing a CUDAFuture, and then every callback is allowed to use any of those devices. (These devices are also propagated to child futures.)
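To illustrate the idea (not the actual CUDAFuture API — `DeviceBoundFuture`, `canRunCallback`, and `createChild` are hypothetical names for this sketch), here is a minimal standalone class showing a "bounding set" of device indices that constrains callbacks and is inherited by child futures:

```cpp
#include <cassert>
#include <set>
#include <utility>
#include <vector>

// Hypothetical sketch of the "bounding set" idea: the future is constructed
// with a fixed set of device indices, and callbacks (as well as child
// futures) may only use devices from that set.
class DeviceBoundFuture {
 public:
  explicit DeviceBoundFuture(std::set<int> devices)
      : devices_(std::move(devices)) {}

  // A callback declares which devices it will use; the future checks that
  // they all fall inside the bounding set.
  bool canRunCallback(const std::vector<int>& used) const {
    for (int d : used) {
      if (devices_.count(d) == 0) {
        return false;
      }
    }
    return true;
  }

  // Child futures inherit the same bounding set, so the constraint
  // propagates through callback chains.
  DeviceBoundFuture createChild() const {
    return DeviceBoundFuture(devices_);
  }

 private:
  std::set<int> devices_;
};
```

With a bounding set of `{0, 1}`, a callback using device 1 is allowed while one using device 2 is rejected, and a child future enforces the same set.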
I'm also making ProcessGroupNCCL pass these devices. I can't yet do the same for TensorPipeAgent until #56405 lands.
ghstack-source-id: 127261552
Test Plan: Added a test for this later in the stack.
Reviewed By: mrshenli
Differential Revision: D27861067
fbshipit-source-id: 8ab2c9d06a514c0407a7e96abc3704e8d5c5dc09