pytorch
df00c636 - [Model Averaging] Skip model averaging for the first K steps (#61207)

Commit

3 years ago

[Model Averaging] Skip model averaging for the first K steps (#61207)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61207

Model averager now must be combined with post-localSGD DDP communication hook. It will skip model averaging for the first K steps, because post-localSGD communication hook will run global gradient averaging during this phase.

Proposal: https://github.com/pytorch/pytorch/issues/59699

ghstack-source-id: 133371335

Test Plan: buck test mode/dev-nosan caffe2/test/distributed:distributed_nccl_fork -- test_periodic_model_averager

Reviewed By: pritamdamania87

Differential Revision: D29523738

fbshipit-source-id: 3fa9611046e1c0afa4bda78aa3ba200fa2a5fa4b
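
The skip-then-average schedule described in the summary can be sketched in plain Python. This is a minimal illustration, not the code from this commit: the class `WarmupSkippingAverager`, its `maybe_average` method, the helper `averaging_steps`, and the parameter names `period` and `warmup_steps` are all hypothetical, and the choice to trigger the first averaging exactly at step K is an assumption. The actual averaging (an all-reduce over model parameters in a real setup) is replaced by a caller-supplied callback so the sketch runs without PyTorch.

```python
from typing import Callable, List


class WarmupSkippingAverager:
    """Hypothetical stand-in for a periodic model averager (not the PyTorch class)."""

    def __init__(self, period: int, warmup_steps: int = 0) -> None:
        if period < 1:
            raise ValueError("period must be a positive integer")
        if warmup_steps < 0:
            raise ValueError("warmup_steps must be non-negative")
        self.period = period
        self.warmup_steps = warmup_steps  # the "first K steps" to skip
        self.step = 0

    def maybe_average(self, average_fn: Callable[[], None]) -> bool:
        """Call average_fn only after warm-up, once every `period` steps.

        During the first `warmup_steps` calls nothing is averaged, on the
        assumption that a gradient-averaging hook keeps replicas in sync.
        Returns True if averaging ran on this call.
        """
        due = (
            self.step >= self.warmup_steps
            and (self.step - self.warmup_steps) % self.period == 0
        )
        if due:
            average_fn()
        self.step += 1
        return due


def averaging_steps(total_steps: int, period: int, warmup_steps: int) -> List[int]:
    """Return the zero-based step indices at which averaging would run."""
    averager = WarmupSkippingAverager(period, warmup_steps)
    fired = []
    for step in range(total_steps):
        # The no-op callback stands in for real parameter averaging.
        if averager.maybe_average(lambda: None):
            fired.append(step)
    return fired
```

For example, `averaging_steps(10, period=4, warmup_steps=3)` skips steps 0 to 2 and reports averaging at steps 3 and 7. In a real training loop the warm-up length would presumably be chosen to match the step at which the post-localSGD hook stops global gradient averaging, so that exactly one of the two mechanisms is active at any time.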

Author

Yi Wang

Committer

facebook-github-bot

Parents

0f6876d7
