pytorch
35f3feca - [RPC Framework] Supporting reading the input from the remote worker (#56943)

[RPC Framework] Supporting reading the input from the remote worker (#56943)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56943

If the module is placed on a CUDA device, then all the CPU tensors in `args` and `kwargs` will also be implicitly moved to the same CUDA device before running forward. Currently the forward output still needs to be moved from the CUDA device back to CPU, until: 1) the process group RPC backend is completely deprecated and the TensorPipe RPC backend is always used; 2) a device map is explicitly provided to the TensorPipe RPC backend. These steps will be done in a separate PR.

Original PR issue: https://github.com/pytorch/pytorch/issues/51670

ghstack-source-id: 127457584

Test Plan:
buck test mode/dev-nosan caffe2/test/distributed/rpc:process_group_agent -- test_input_moved_to_cuda_device
buck test mode/dev-nosan caffe2/test/distributed/rpc:process_group_agent -- test_input_moved_to_cuda_device_script
buck test mode/dev-nosan caffe2/test/distributed/rpc:process_group_agent -- RemoteModule
buck test mode/dev-nosan //caffe2/torch/fb/training_toolkit/applications/sparse_nn/batch_distributed_inference/tests:batch_distributed_inference_test -- --exact 'caffe2/torch/fb/training_toolkit/applications/sparse_nn/batch_distributed_inference/tests:batch_distributed_inference_test - test_load_di_parts (caffe2.torch.fb.training_toolkit.applications.sparse_nn.batch_distributed_inference.tests.batch_distributed_inference_test.BatchDistributedInferenceTest)'

Reviewed By: wanchaol

Differential Revision: D27934791

fbshipit-source-id: de27e27b905db83cc52800e63684fc6c942e9dc7
Author: Yi Wang