Megatron-DeepSpeed
8c1ed225
- do all_reduce op.AVG directly
Commit
3 years ago
do all_reduce op.AVG directly
References
#272 - sync layer norms
#292 - a branch combining layer-norm-auto-sync and ds_ckpt_reshape
Author
stas00
Parents
a9fb317e
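
The commit message mentions doing the all-reduce with `op.AVG` directly. The commit's diff is not shown here, so the following is only a minimal sketch of what that phrase usually means in PyTorch: replacing a sum all-reduce followed by a division with a single averaging collective. The helper name `all_reduce_avg` is hypothetical, and the fallback branch is an assumption added so the sketch also runs on backends other than NCCL; `torch.distributed.ReduceOp.AVG` is, as far as the PyTorch documentation states, only supported by the NCCL backend.

```python
import torch
import torch.distributed as dist


def all_reduce_avg(tensor: torch.Tensor, group=None) -> torch.Tensor:
    """Average a floating-point tensor in place across the ranks of `group`.

    Hypothetical helper for illustration; not taken from the commit.
    """
    if dist.get_backend(group) == "nccl":
        # Single collective: the backend computes the mean itself.
        dist.all_reduce(tensor, op=dist.ReduceOp.AVG, group=group)
    else:
        # Assumed fallback for backends without AVG: sum, then divide.
        dist.all_reduce(tensor, op=dist.ReduceOp.SUM, group=group)
        tensor /= dist.get_world_size(group=group)
    return tensor
```

The helper has to be called after `torch.distributed.init_process_group(...)` and by every rank in the group, as with any collective. In a layer-norm sync setting like the one named in the references, it would be applied to each layer-norm parameter so that all ranks hold the same averaged values.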