pytorch
233e4ebd - Implement autograd functions for c10d communication operations (#40762)

Commit
4 years ago
Implement autograd functions for c10d communication operations (#40762)

Summary:
Closes https://github.com/pytorch/pytorch/issues/40702, Fixes https://github.com/pytorch/pytorch/issues/40690

Currently wip. But I would appreciate some feedback. Functions should be double-differentiable.

Contrary to https://github.com/pytorch/pytorch/blob/b35cdc5200af963e410c0a25400fd07f30b89bca/torch/nn/parallel/_functions.py
This PR generates list of tensors instead of aggregating the received data in a single tensor. Is this behavior correct?

Thanks!

Pull Request resolved: https://github.com/pytorch/pytorch/pull/40762

Reviewed By: glaringlee

Differential Revision: D24758889

Pulled By: mrshenli

fbshipit-source-id: 79285fb4b791cae3d248f34e2aadb11c9ab10cce
Author
Emilio Castillo