DeepSpeed e801e6d7 - skipping redundant MoE optimizer state loading (#4120)
Commit
2 years ago
skipping redundant MoE optimizer state loading (#4120)

Co-authored-by: Alexander Jipa <azzhipa@amazon.com>
References
#4120 - tolerating missing optimizer states for MoE [2nd attempt]
Author
Alexander Jipa
Parents
9894c06a