DeepSpeed 77b649d1 - Clip gradients of last stage tied weights
Commit
3 years ago
Clip gradients of last stage tied weights
References: #1801 - bf16+pipeline parallelism
Author: tjruwase
Parents: 19198688