Megatron-DeepSpeed
fc8f813d - dynamically discovered layer norm weights / refactor

Commit
3 years ago