[Gradient Compression] Simplify the implementation of error feedback and warm-start (#50981)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50981
Since vanilla allreduce is applied in the first few iterations, during which DDP rebuilds its buckets, the bucket rebuilding process no longer affects the caching of per-variable tensors.
Previously, the cached tensors used for error feedback and warm-start had to be rebuilt later, because the shapes of their corresponding input tensors change after the bucket rebuild process. A sketch of the relevant control flow follows.
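For context, here is a minimal sketch of the control flow this simplification relies on, assuming a PowerSGD-style hook gated by a `start_powerSGD_iter` threshold. This is not the actual hook implementation; method names such as `bucket.buffer()` and `bucket.index()` are approximate and may differ across PyTorch versions.

```python
import torch
import torch.distributed as dist

def powerSGD_like_hook(state, bucket):
    # A minimal sketch, not the actual PowerSGD hook implementation.
    input_tensor = bucket.buffer()

    # Vanilla allreduce runs for the first `start_powerSGD_iter` iterations.
    # DDP rebuilds its gradient buckets during these early iterations, so
    # bucket/tensor shapes are already stable by the time compression starts.
    if state.iter < state.start_powerSGD_iter:
        state.maybe_increase_iter(bucket)
        fut = dist.all_reduce(input_tensor, async_op=True).get_future()
        return fut.then(lambda f: f.value()[0])

    # From here on, per-variable cached tensors can be allocated once and
    # reused without ever being rebuilt, since input shapes no longer
    # change. E.g., the error-feedback residual:
    bucket_index = bucket.index()
    if bucket_index not in state.error_dict:
        state.error_dict[bucket_index] = torch.zeros_like(input_tensor)
    input_tensor.add_(state.error_dict[bucket_index])

    # ... low-rank compression, allreduce of the low-rank factors,
    # decompression, and updating state.error_dict would follow here ...
```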
Original PR issue: Investigate Applying PowerSGD to Communication Hook for Gradient Compression #47202
ghstack-source-id: 120617971
Test Plan: real run
Reviewed By: rohan-varma
Differential Revision: D26034418
fbshipit-source-id: e8744431c7f3142d75b77b60110e6861c2ff5c14