DeepSpeed
69f3ce48
- Add weight copy to base mlp ln, separate norms from tensors in mlp/attn print statements
Commit
2 years ago
Add weight copy to base mlp ln, separate norms from tensors in mlp/attn print statements
Author
lekurile
Committer
molly-smith
Parents
ac3eee13