DeepSpeed
8d901bfc
- Fix the gradient scale when ZeRO is not enabled
Commit
1 year ago
References
#4530 - Fix the sequence-parallelism for the dense model architecture
Author
Reza Yazdani
Parents
066644d7
Files (1)
deepspeed/runtime/engine.py