DeepSpeed
149f60b7
- stage3.py: do not scale if gradient_predivide_factor is 1.0 (#3630)
Commit
2 years ago
stage3.py: do not scale if gradient_predivide_factor is 1.0 (#3630)

this change also aligns with the logic before reduce_scatter_coalesced

Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
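
The idea in the commit message can be illustrated with a minimal sketch. This is not the code from stage3.py: the helper name `scale_gradients` and the use of plain Python floats instead of tensors are assumptions made only for illustration. It shows the guard the message describes, where scaling is skipped entirely when the factor is 1.0, since dividing by 1.0 changes nothing and only costs extra work.

```python
from typing import List


def scale_gradients(grads: List[float], gradient_predivide_factor: float) -> List[float]:
    """Hypothetical helper: divide gradients by the predivide factor.

    Plain floats stand in for gradient tensors here; the real code
    operates on tensors and is not reproduced in this sketch.
    """
    # Dividing by 1.0 is a no-op, so return the input untouched
    # and avoid allocating a new list (or launching a kernel).
    if gradient_predivide_factor == 1.0:
        return grads
    return [g / gradient_predivide_factor for g in grads]


if __name__ == "__main__":
    print(scale_gradients([2.0, 4.0], 1.0))
    print(scale_gradients([2.0, 4.0], 2.0))
```

With a factor of 1.0 the function returns the same list object it was given, so no copy is made; with any other factor a new, scaled list is returned.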
Author
guoyejun
Committer
molly-smith
Parents
003b62cc