Megatron-DeepSpeed
fc8f813d
- dynamically discovered layer norm weights / refactor
Commit
3 years ago
References
#274 - Sync 4 layer norms - bf16, fp32, optimizer states on restart
Author
stas00
Parents
8f2ea60b