DeepSpeed
f93e22b3 - Correctness fix PP+ZeRO for gradient accumulation + updates from master (#1263)

Commit
4 years ago
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>