DeepSpeed
cfc6ed37
- bf16_optimizer: fixes to different grad acc dtype (#6485)
Commit
1 year ago
bf16_optimizer: fixes to different grad acc dtype (#6485)
- fix step function to cast to FP32 before step in case of different gradient accumulation data type
- remove redundant function initialize_optimizer_states()
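The first fix can be illustrated with a minimal PyTorch sketch. This is not the DeepSpeed code changed by this commit. The helper name step_with_fp32_cast and the toy SGD setup are hypothetical, and it only assumes that FP32 master weights receive gradients accumulated in another dtype (BF16 here), which have to be cast to FP32 before the optimizer step.

```python
import torch


def step_with_fp32_cast(fp32_params, accumulated_grads, optimizer):
    """Hypothetical helper, not DeepSpeed's API.

    Attaches accumulated gradients to FP32 master parameters, casting
    them to FP32 first when the accumulation dtype differs, then steps.
    """
    for param, grad in zip(fp32_params, accumulated_grads):
        # PyTorch rejects a .grad whose dtype differs from the parameter's,
        # so a BF16 accumulation buffer must be cast before assignment.
        if grad.dtype != param.dtype:
            grad = grad.to(param.dtype)
        param.grad = grad
    optimizer.step()
    # Drop the gradients so the next accumulation window starts clean.
    for param in fp32_params:
        param.grad = None


# Toy setup: one FP32 master parameter, gradients accumulated in BF16.
master = [torch.nn.Parameter(torch.ones(4, dtype=torch.float32))]
opt = torch.optim.SGD(master, lr=0.1)
grads = [torch.full((4,), 0.5, dtype=torch.bfloat16)]

step_with_fp32_cast(master, grads, opt)
print(master[0].dtype, master[0].detach())
```

In this sketch the cast happens only when the dtypes differ, so an FP32 accumulation buffer is passed through unchanged.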
References
#6485 - bf16_optimizer: fixes to different grad acc dtype
Author
nelyahu
Parents
9b7fc545