DeepSpeed
00c3a254 - Bug fix for norm calculation in absence of model parallel group (#551)

Commit
5 years ago
In the absence of a model parallel group, model_parallel_allreduce should not perform any reduction. This commit fixes a bug in which a model parallel allreduce was performed across the world group when the model parallel group was None.
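The guard described above can be sketched roughly as follows. This is an illustrative sketch, not the code from this commit: the helper name `reduce_norm_across_model_parallel`, its parameters, and the stand-in `fake_all_reduce_sum` are all hypothetical, and the collective is injected as a callable so the example runs without a distributed setup. The likely reason the guard matters is that `torch.distributed.all_reduce` treats `group=None` as the default (world) process group, so passing a missing model parallel group straight through would reduce across all ranks.

```python
from typing import Callable, Optional


def reduce_norm_across_model_parallel(
    local_norm_sq: float,
    mp_group: Optional[object],
    all_reduce_sum: Callable[[float, object], float],
) -> float:
    """Hypothetical helper (not DeepSpeed's actual function).

    Sums a squared norm across the model parallel group, or returns the
    local value unchanged when there is no model parallel group.
    """
    if mp_group is None:
        # No model parallel group: the local value is already complete,
        # so no reduction (and in particular no world-group reduction) runs.
        return local_norm_sq
    return all_reduce_sum(local_norm_sq, mp_group)


# Stand-in for a real collective; it records which groups it was called with.
calls = []


def fake_all_reduce_sum(value: float, group: tuple) -> float:
    calls.append(group)
    # Pretend every rank in the group holds the same local value.
    return value * len(group)


no_group = reduce_norm_across_model_parallel(4.0, None, fake_all_reduce_sum)
with_group = reduce_norm_across_model_parallel(
    4.0, ("rank0", "rank1"), fake_all_reduce_sum
)
```

With no group, the stand-in collective is never invoked and the local value comes back unchanged; with a two-rank group, the reduction runs once over that group only.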