pytorch
5ab8afe4 - [Model Averaging] Support disabling post-local gradient sync (#76723)

Commit
2 years ago
Disabling the intra-subgroup gradient allreduce can sometimes still give satisfactory accuracy, so this gradient averaging is now configurable. This observation does not take into account the communication saved by skipping the gradient allreduce.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76723
Approved by: https://github.com/rohan-varma
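To make the behavior concrete, here is a toy, single-process simulation of the decision this change makes configurable. It is only a sketch under stated assumptions, not the PyTorch implementation: the function `sync_gradients`, its parameters (including the flag name `post_local_gradient_allreduce`), and the use of one scalar gradient per worker are all chosen for illustration, and the real option should be checked against the pull request.

```python
from typing import List


def average(values: List[float]) -> float:
    """Arithmetic mean, standing in for an allreduce followed by a divide."""
    return sum(values) / len(values)


def sync_gradients(
    grads: List[float],
    iteration: int,
    start_local_sgd_iter: int,
    subgroup_size: int,
    post_local_gradient_allreduce: bool = True,  # hypothetical flag name
) -> List[float]:
    """Return the gradient each simulated worker would apply.

    grads holds one scalar gradient per worker; workers are split into
    consecutive subgroups of subgroup_size (an assumption of this sketch).
    """
    # Before local SGD starts: average gradients across all workers.
    if iteration < start_local_sgd_iter:
        mean = average(grads)
        return [mean] * len(grads)

    # After local SGD starts, with the sync disabled: keep local gradients.
    if not post_local_gradient_allreduce:
        return list(grads)

    # After local SGD starts, with the sync enabled: average per subgroup.
    synced: List[float] = []
    for start in range(0, len(grads), subgroup_size):
        group = grads[start:start + subgroup_size]
        synced.extend([average(group)] * len(group))
    return synced


if __name__ == "__main__":
    worker_grads = [1.0, 3.0, 5.0, 7.0]
    print(sync_gradients(worker_grads, 0, 10, 2))
    print(sync_gradients(worker_grads, 10, 10, 2))
    print(sync_gradients(worker_grads, 10, 10, 2, post_local_gradient_allreduce=False))
```

With the flag disabled, no gradient communication happens after local SGD starts in this sketch, which is the communication saving the message mentions.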