pytorch
df00c636 - [Model Averaging] Skip model averaging for the first K steps (#61207)

Commit
3 years ago
[Model Averaging] Skip model averaging for the first K steps (#61207)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61207

Model averager now must be combined with post-localSGD DDP communication hook. It will skip model averaging for the first K steps, because post-localSGD communication hook will run global gradient averaging during this phase.

Proposal: https://github.com/pytorch/pytorch/issues/59699
ghstack-source-id: 133371335

Test Plan: buck test mode/dev-nosan caffe2/test/distributed:distributed_nccl_fork -- test_periodic_model_averager

Reviewed By: pritamdamania87

Differential Revision: D29523738

fbshipit-source-id: 3fa9611046e1c0afa4bda78aa3ba200fa2a5fa4b
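The skip-for-K-steps behavior described in the summary can be illustrated with a minimal, framework-free sketch. It is not the code from this commit: the class name `WarmupSkippingAverager`, the injected `average_fn` callback, and the `period` and `warmup_steps` parameters are all hypothetical, and the choice to average every `period` steps once the warm-up ends is an assumption made for illustration. In a real setup `average_fn` would perform a collective average across workers; here it is any callable.

```python
from typing import Callable, List


class WarmupSkippingAverager:
    """Hypothetical sketch: skip averaging for the first `warmup_steps` calls.

    Assumption: during warm-up something else (e.g. a communication hook)
    already keeps workers in sync, so averaging here would be redundant.
    """

    def __init__(
        self,
        average_fn: Callable[[List[float]], None],
        period: int,
        warmup_steps: int = 0,
    ) -> None:
        if period < 1:
            raise ValueError("period must be a positive integer")
        if warmup_steps < 0:
            raise ValueError("warmup_steps must be non-negative")
        self.average_fn = average_fn
        self.period = period
        self.warmup_steps = warmup_steps
        self.step = 0  # number of calls made so far

    def average_parameters(self, params: List[float]) -> bool:
        """Run `average_fn` if this step is due; return whether it ran."""
        past_warmup = self.step >= self.warmup_steps
        due = past_warmup and (self.step - self.warmup_steps) % self.period == 0
        if due:
            self.average_fn(params)
        self.step += 1
        return due


if __name__ == "__main__":
    calls: List[List[float]] = []
    averager = WarmupSkippingAverager(
        average_fn=lambda p: calls.append(list(p)),  # stand-in for a real all-reduce
        period=2,
        warmup_steps=3,
    )
    ran = [averager.average_parameters([1.0, 2.0]) for _ in range(7)]
    print(ran)
```

With `warmup_steps=3` and `period=2`, the first three calls are skipped and averaging then runs on every second call. Keeping the step counter inside the averager means the training loop can call `average_parameters` unconditionally after each optimizer step.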
Author
Yi Wang