DeepSpeed
69f3ce48 - Add weight copy to base mlp ln, separate norms from tensors in mlp/attn print statements

Commit
2 years ago