pytorch
41f62b17 - Fix DDP join() API in the case of model.no_sync() (#44427)

Commit
4 years ago
Fix DDP join() API in the case of model.no_sync() (#44427) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/44427 Closes https://github.com/pytorch/pytorch/issues/44425 DDP join API currently does not work properly with `model.no_sync()`, see https://github.com/pytorch/pytorch/issues/44425 for details. This PR fixes the problem via the approach mentioned in the issue, namely scheduling an allreduce that tells joined ranks whether to sync in the backwards pass or not. Tests are added for skipping gradient synchronization for various `sync_interval`s. ghstack-source-id: 111786479 Reviewed By: pritamdamania87 Differential Revision: D23609070 fbshipit-source-id: e8716b7881f8eee95e3e3499283e716bd3d7fe76
Author
Parents
Loading