Megatron-DeepSpeed
bdd75f18 - 1B3 parameter setup + flos counting

Commit
4 years ago