DeepSpeed 95aee34f - fix for complete_grad_norm_calc in stage3
Commit
1 year ago
fix for complete_grad_norm_calc in stage3

place err tensor on the same device as inf_or_nan
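The kind of change the message describes can be sketched as follows. This is a minimal illustration, not the code from the commit: the function name `norm_or_sentinel` is hypothetical, and the use of -1.0 as the error value and of `torch.where` for the selection are assumptions. Only the names `err` and `inf_or_nan` come from the commit message. The point is that the error tensor is created on whatever device `inf_or_nan` already lives on, so combining the two never mixes devices.

```python
import torch


def norm_or_sentinel(total_norm: torch.Tensor) -> torch.Tensor:
    """Return total_norm, or an error sentinel if it is inf or NaN.

    Hypothetical helper for illustration; not the DeepSpeed implementation.
    """
    # Boolean flag tensor; it lives on the same device as total_norm.
    inf_or_nan = total_norm.isinf().logical_or(total_norm.isnan())

    # Create the error value on the same device as inf_or_nan (assumed
    # sentinel: -1.0), so the selection below stays on a single device.
    err = torch.tensor(-1.0, device=inf_or_nan.device, dtype=total_norm.dtype)

    # Pick the sentinel where the norm is invalid, the norm itself otherwise.
    return torch.where(inf_or_nan, err, total_norm)
```

With a CPU tensor, `norm_or_sentinel(torch.tensor(3.0))` returns the norm unchanged, while an infinite or NaN input yields the sentinel. The same call works for a tensor on an accelerator because `err` follows the input's device.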
References
lekurile/offload_fix_test
#5493 - re-introduce: stage3: efficient compute of scaled_global_grad_norm
Author
Nadav Elyahu
Parents
63a89be1