DeepSpeed
0bb0cc80
- Use zero-tensors for missing gradients to avoid size mismatch
Commit
5 years ago
Use zero-tensors for missing gradients to avoid size mismatch
References
#545 - Fix unbalanced gradients bug in ZeRO-2 gradient accumulation
Author
tjruwase
Parents
9de21b72
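
The commit message describes substituting zero tensors for missing gradients so that sizes stay consistent. The sketch below illustrates that general idea only, and it is not the code from this commit. It assumes the mismatch arises when per-parameter gradients are flattened into one buffer and some parameters have `grad` set to `None`. The helper name `flatten_grads_with_zero_fill` is hypothetical and is not part of DeepSpeed.

```python
import torch


def flatten_grads_with_zero_fill(params):
    """Flatten gradients into one 1-D buffer (hypothetical helper).

    A parameter whose .grad is None contributes a zero tensor with the
    same number of elements, so the flattened buffer always has
    sum(p.numel() for p in params) entries.
    """
    pieces = []
    for p in params:
        if p.grad is None:
            # Missing gradient: substitute zeros of matching size and dtype.
            pieces.append(torch.zeros(p.numel(), dtype=p.dtype, device=p.device))
        else:
            pieces.append(p.grad.reshape(-1))
    return torch.cat(pieces)


# Usage: `b` never received a gradient, yet the buffer keeps its full size.
a = torch.nn.Parameter(torch.ones(3))
b = torch.nn.Parameter(torch.ones(2, 2))
a.grad = torch.full((3,), 2.0)
flat = flatten_grads_with_zero_fill([a, b])
print(flat.numel())
```

Without the zero fill, the buffer would hold only the three entries from `a`, and any code that expects one slot per parameter element would see a size mismatch.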