pytorch
13dbb77b - [RPC Framework] Enable RemoteModule to directly send GPU tensors over the wire on TensorPipe RPC backend if a device map is provided (#57288)

Commit

3 years ago

[RPC Framework] Enable RemoteModule to directly send GPU tensors over the wire on TensorPipe RPC backend if a device map is provided (#57288) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/57288 If the device map provided by RemoteModue is not empty, then TensorPipe RPC backend can support directly sending GPU tensors over the wire. Also add pybind of `_get_device_map`. The changes in unit test setup is separated out as a follow-up PR, as currently it breaks some tests in `distributed/rpc/test_faulty_agent.py`. Still need to fix test_load_di_parts in `torch/fb/training_toolkit/applications/sparse_nn/batch_distributed_inference/tests:batch_distributed_inference_test`. Currently an early return is used to bypass this test failure. #Original PR issue: https://github.com/pytorch/pytorch/issues/51670 Test Plan: buck test mode/dev-nosan caffe2/test/distributed/rpc:process_group_agent -- test_input_moved_to_cuda_device buck test mode/dev-nosan caffe2/test/distributed/rpc:process_group_agent -- test_input_moved_to_cuda_device_script buck test mode/dev-nosan caffe2/test/distributed/rpc:process_group_agent -- RemoteModule -j 1 CAUTION: This one actually fails and now it is bypassed. See FIXME in `_remote_forward`. buck test mode/dev-nosan caffe2/torch/fb/training_toolkit/applications/sparse_nn/batch_distributed_inference/tests:batch_distributed_inference_test -- test_load_di_parts Reviewed By: wanchaol Differential Revision: D28021672 fbshipit-source-id: a89245dc35e1d9479811ec6f98d9f34116837d79

Author

Yi Wang

Committer

facebook-github-bot

Parents

20eac093

pytorch 13dbb77b - [RPC Framework] Enable RemoteModule to directly send GPU tensors over the wire on TensorPipe RPC backend if a device map is provided (#57288)

pytorch
13dbb77b - [RPC Framework] Enable RemoteModule to directly send GPU tensors over the wire on TensorPipe RPC backend if a device map is provided (#57288)