Remove `finalize_bucket_sparse` from DDP (#40130)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40130
The sparse gradients for the model and the tensor used to perform allreduce
in DDP are essentially the same object and share the same storage. As a
result, once the allreduce completes, the sparse gradients are automatically
updated, and unlike dense gradients, we don't need to assign the bucket's
contents back to the grad.
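To illustrate the idea (this is a minimal standalone sketch, not the actual DDP reducer code): when two tensor objects share storage, an in-place write through one is immediately visible through the other, so no explicit copy-back step is needed.

```python
import torch

# grad and bucket are distinct tensor objects backed by the same storage,
# mirroring the sparse-gradient/bucket relationship described above.
grad = torch.tensor([1.0, 2.0, 3.0])
bucket = grad.view(-1)   # a view: same storage, no copy

# Simulate an allreduce writing its result into the bucket in place.
bucket.add_(10.0)

# The gradient sees the update without any copy-back.
assert grad.tolist() == [11.0, 12.0, 13.0]
assert grad.data_ptr() == bucket.data_ptr()
```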
In addition, I've added a test for distributed autograd to ensure it works
correctly with sparse gradients. While writing this test, I discovered that
`finalize_bucket_sparse` was redundant: the test passed without requiring any
changes to `finalize_bucket_sparse`, which only looked at the `.grad` field.
ghstack-source-id: 106090063
Test Plan: waitforbuildbot
Differential Revision: D22080004
fbshipit-source-id: 493ce48b673f26b55dffd6894a3915dc769839f6