DeepSpeed
54c06872
- stage3: efficient compute of scaled_global_grad_norm (#5256)
Commit
1 year ago
stage3: efficient compute of scaled_global_grad_norm (#5256)

using torch.norm instead of inefficient for loop

---------

Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
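The diff itself is not included on this page, so the following is only a minimal sketch of the kind of change the message describes, not the actual code in stage3.py. It assumes the per-group gradient norms are available as 0-dimensional tensors; the names `norm_groups`, `global_norm_loop`, and `global_norm_stacked` are hypothetical. The global norm is the 2-norm of the vector of per-group norms, so the Python loop can be replaced by one `torch.norm` call over the stacked tensor.

```python
import math

import torch


def global_norm_loop(norm_groups):
    """Loop-based version: square and accumulate each group norm in Python."""
    total = 0.0
    for norm in norm_groups:
        # float() copies each value to the host; on a GPU tensor this
        # forces a device-to-host synchronization on every iteration.
        total += float(norm) ** 2
    return math.sqrt(total)


def global_norm_stacked(norm_groups):
    """Vectorized version: stack the 0-dim norms and take one 2-norm."""
    return torch.norm(torch.stack(norm_groups))


# Hypothetical per-group norms chosen so the result is easy to check:
# sqrt(3^2 + 4^2 + 12^2) = sqrt(169) = 13
norm_groups = [torch.tensor(3.0), torch.tensor(4.0), torch.tensor(12.0)]

loop_result = global_norm_loop(norm_groups)
stacked_result = global_norm_stacked(norm_groups)
```

Both functions return the same value for these inputs. The stacked version keeps the result as a tensor on the original device, which avoids the per-group host transfer that the loop incurs.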
References
lekurile/ds_chat_test_54c06872
#5256 - stage3: efficient compute of scaled_global_grad_norm
Author
nelyahu
Parents
7b5b0660
Files (1)
deepspeed/runtime/zero/stage3.py