DeepSpeed 52907a66 - stage3.py: do not scale if gradient_predivide_factor is 1.0 (#3630)
Commit
2 years ago
stage3.py: do not scale if gradient_predivide_factor is 1.0 (#3630)

this change also aligns with the logic before reduce_scatter_coalesced

Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
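
The idea in the commit message can be illustrated with a minimal sketch. This is not the code from stage3.py: the helper name `scale_if_needed`, the use of plain Python floats in place of tensors, and the choice of division as the scaling operation are all assumptions made for illustration. The point is only that scaling by a factor of 1.0 leaves the values unchanged, so the extra pass over the gradients can be skipped.

```python
def scale_if_needed(grads, gradient_predivide_factor):
    """Hypothetical helper (not DeepSpeed's API): scale gradients only when needed.

    `grads` is a list of floats standing in for gradient tensors.
    Dividing by the factor is an assumption made for this sketch.
    """
    # A factor of 1.0 makes the scaling a no-op, so return the input untouched
    # and avoid allocating a new list.
    if gradient_predivide_factor == 1.0:
        return grads
    # Otherwise apply the scaling to every gradient.
    return [g / gradient_predivide_factor for g in grads]


grads = [2.0, 4.0]
unscaled = scale_if_needed(grads, 1.0)  # same list object, no work done
scaled = scale_if_needed(grads, 2.0)    # new list with each value divided by 2.0
```

With real tensors, the skipped branch would presumably also avoid allocating scaled copies of each gradient, which is likely where the saving comes from.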
References
#3630 - stage3.py: do not scale if gradient_predivide_factor is 1.0
Author
guoyejun
Parents
49a73549