DeepSpeed
792ce6f9
- tiled expert parameter groups and combined grad-upscale and optimizer step in zero-2
Commit
3 years ago
tiled expert parameter groups and combined grad-upscale and optimizer step in zero-2
Author
siddharth9820
Parents
828ab718