pytorch
b12f34e8 - update rpc tensorpipe logic for sparse tensors (#62960)

Commit

3 years ago

update rpc tensorpipe logic for sparse tensors (#62960)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62960

A bug was filed a few years ago for sending sparse tensors over RPC (#30807). This PR updates the rpc/tensorpipe logic for CUDA sparse tensors. During serialization, the pickler.cpp implementation breaks a sparse tensor down into two tensors and metadata.

torch/csrc/distributed/rpc/tensorpipe_agent.cpp needs to be updated because it has no logic for sparse tensors. It pushes a single device for a sparse tensor. This is wrong because once the sparse tensor has been serialized, there are two tensors. The second tensor has no device, so it ends up with the wrong target device.

tensorpipe_utils.cpp needs to be updated because deserialization happens after the data is received on the target pipe. Deserialization takes the two tensors and the metadata that were sent and rebuilds the sparse tensor, so there are two tpDescriptors but only one tensor after deserialization. The logic is updated to verify that the sparse tensor is on the correct device using the first tpDescriptor.

This PR also updates ivalue.cpp and ivalue.h to support more paths for sparse COO tensors.

I tested these changes by adding sparse tests to rpc_test.py and dist_autograd_test.py.

Test Plan: Imported from OSS

Reviewed By: gchanan

Differential Revision: D30717285

Pulled By: gcramer23

fbshipit-source-id: daee9a56764550f56b131f9dd8e74e23113d6714
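
The device bookkeeping described above can be sketched with a small pure-Python model. This is an illustration only, not the code in tensorpipe_agent.cpp or tensorpipe_utils.cpp: every class and function name below is hypothetical, and the assumption that the two serialized tensors are the COO indices and values is mine, not stated in the commit message. It shows why one device per message tensor is too few once a sparse tensor is split in two, and how the receiving side can check the rebuilt tensor against the first descriptor.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class DenseTensor:
    """Hypothetical stand-in for a dense tensor: a label plus a device string."""
    name: str
    device: str


@dataclass
class SparseCooTensor:
    """Hypothetical stand-in for a sparse COO tensor.

    Assumption: it serializes into two dense tensors (indices and values)
    plus metadata, which here is just the size.
    """
    indices: DenseTensor
    values: DenseTensor
    size: Tuple[int, ...]

    @property
    def device(self) -> str:
        return self.values.device


Tensor = Union[DenseTensor, SparseCooTensor]


def wire_tensors(t: Tensor) -> List[DenseTensor]:
    """Return the dense tensors that actually travel over the pipe."""
    if isinstance(t, SparseCooTensor):
        return [t.indices, t.values]
    return [t]


def devices_per_message_tensor(tensors: List[Tensor]) -> List[str]:
    """Behavior the commit describes as wrong: one device per message tensor."""
    return [t.device for t in tensors]


def devices_per_wire_tensor(tensors: List[Tensor]) -> List[str]:
    """One device per serialized tensor, so a sparse tensor contributes two."""
    return [part.device for t in tensors for part in wire_tensors(t)]


def rebuild_sparse(
    indices: DenseTensor,
    values: DenseTensor,
    size: Tuple[int, ...],
    descriptor_devices: List[str],
) -> SparseCooTensor:
    """Rebuild one sparse tensor from two received tensors.

    Two descriptors arrive but only one tensor results, so the device
    check uses the first descriptor.
    """
    if len(descriptor_devices) != 2:
        raise ValueError("expected two descriptors for a sparse tensor")
    rebuilt = SparseCooTensor(indices, values, size)
    if rebuilt.device != descriptor_devices[0]:
        raise RuntimeError("sparse tensor landed on an unexpected device")
    return rebuilt


sparse = SparseCooTensor(
    indices=DenseTensor("indices", "cuda:0"),
    values=DenseTensor("values", "cuda:0"),
    size=(2, 3),
)
dense = DenseTensor("weights", "cuda:1")
message = [sparse, dense]

sent = [part for t in message for part in wire_tensors(t)]
too_few = devices_per_message_tensor(message)
matched = devices_per_wire_tensor(message)
rebuilt = rebuild_sparse(sent[0], sent[1], sparse.size, matched[:2])
```

In this model `sent` holds three tensors while `too_few` holds only two devices, so the device list runs out before every serialized tensor has an entry. `matched` has one entry per serialized tensor, which is the invariant the commit message says the agent needs.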

References

#65112 - [LTC] Merge master

Author

gcramer23

Committer

facebook-github-bot

Parents

32a93c24
