llama.cpp
5582c49c
- gemma : more consistent attention scaling for v2 and v3 (#13951)
Commit
101 days ago
gemma : more consistent attention scaling for v2 and v3 (#13951)

* gemma : fix attn scale for 27B
* cont : apply scale before attn
* cont : consistent attention scaling
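
The "apply scale before attn" step can be illustrated with a minimal sketch. This is not the code from the commit: the function names are hypothetical, and the value `1/sqrt(head_dim)` is only an illustrative scale, not necessarily the one used by any Gemma variant. The sketch shows that multiplying the query by the scale before the dot product gives the same attention weights as scaling the logits afterwards, so the position of the scale is a matter of consistency rather than of the mathematical result.

```python
import math

def dot(a, b):
    # Plain dot product of two equal-length vectors.
    return sum(x * y for x, y in zip(a, b))

def softmax(xs):
    # Numerically stable softmax over a list of logits.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attn_weights_scale_after(q, keys, scale):
    # Hypothetical helper: compute q.k first, then scale the logits.
    return softmax([scale * dot(q, k) for k in keys])

def attn_weights_scale_before(q, keys, scale):
    # Hypothetical helper: scale the query first, then compute q.k.
    q_scaled = [scale * x for x in q]
    return softmax([dot(q_scaled, k) for k in keys])

# Toy query and keys for a single attention head (made-up values).
q = [0.5, -1.0, 2.0, 0.25]
keys = [
    [1.0, 0.0, 0.5, -0.5],
    [0.2, 0.3, -0.1, 0.9],
    [-1.0, 1.0, 0.0, 0.0],
]

# Assumption: an illustrative scale only; real models may use a
# different, model-specific value.
scale = 1.0 / math.sqrt(len(q))

after = attn_weights_scale_after(q, keys, scale)
before = attn_weights_scale_before(q, keys, scale)
```

Because the scale is a parameter here, the same two helpers can be called with any model-specific value; the two orderings agree up to floating-point rounding.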
References
#13951 - gemma : more consistent attention scaling for v2 and v3
Author
ggerganov
Parents
c9bbc779