DeepSpeed
af512117 - Samyamr/zero offload correctness (#359)

Commit
5 years ago
Samyamr/zero offload correctness (#359)

* fixing gradient accumulation for zero offload
* Bug fixes. ZeRO Stage 1,2 and Offload all produce the same loss with gradient accumulation step of 2