Multiple module outputs and multiple calls to backward (#19799)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19799
A module that returns multiple outputs, where the caller may end up
making multiple calls to torch.autograd.backward, did not work with
DistributedDataParallel. It expected the first call to
torch.autograd.backward to provide gradients for ALL parameters that
require gradients and were used in computing the module outputs. If the
outputs have disjoint autograd graphs, it should be fine to call
torch.autograd.backward on each of them and fill in the module's
parameter gradients in separate chunks.
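
For illustration, a minimal sketch (not part of this PR) of the usage pattern this enables. The module and names below are hypothetical; the two heads share no parameters, so their autograd graphs are disjoint:

```python
import torch
import torch.nn as nn

class TwoHeadModule(nn.Module):
    def __init__(self):
        super().__init__()
        self.head_a = nn.Linear(10, 1)
        self.head_b = nn.Linear(10, 1)

    def forward(self, x):
        # The two outputs use disjoint sets of parameters,
        # so each output has its own autograd graph.
        return self.head_a(x), self.head_b(x)

# Assuming a process group has already been initialized
# (e.g. via torch.distributed.init_process_group):
#
# model = nn.parallel.DistributedDataParallel(TwoHeadModule().cuda())
# out_a, out_b = model(torch.randn(8, 10).cuda())
# out_a.sum().backward()  # fills gradients for head_a's parameters only
# out_b.sum().backward()  # fills gradients for head_b's parameters only
```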
With this change, we delay queueing the finalizer callback until all
buckets have been marked as ready, instead of queueing it the first
time we receive an autograd hook. This makes the current implementation
functionally equivalent to the DistributedDataParallel implementation
from before #18953 was merged.
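
For context, a simplified, hypothetical Python sketch of the new behavior; the actual logic lives in the C++ reducer, and the names below (ReducerSketch, mark_bucket_ready, queue_finalizer) are illustrative only:

```python
class ReducerSketch:
    def __init__(self, num_buckets):
        self.num_buckets = num_buckets
        self.buckets_ready = 0
        self.finalizer_queued = False

    def mark_bucket_ready(self):
        self.buckets_ready += 1
        # Previously the finalizer was queued on the FIRST autograd hook,
        # so it fired at the end of the first backward call even when that
        # call only produced gradients for a subset of the parameters.
        # Now it is queued only once ALL buckets are ready, which may take
        # more than one call to torch.autograd.backward.
        if self.buckets_ready == self.num_buckets and not self.finalizer_queued:
            self.finalizer_queued = True
            self.queue_finalizer()

    def queue_finalizer(self):
        # Stand-in for enqueueing the callback that finalizes the allreduce
        # and resets per-iteration state at the end of the backward pass.
        pass
```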
Reviewed By: mrshenli
Differential Revision: D15097045
fbshipit-source-id: 2df023319713bc31e29a8b45108c78e6593fccd4