transformers
84556e05
- remove q_norm/k_norm sharding and gather after projections
Commit
8 days ago
remove q_norm/k_norm sharding and gather after projections

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Author
dacorvo
Parents
ff7c92d7