llama.cpp
5582c49c - gemma : more consistent attention scaling for v2 and v3 (#13951)

Commit
101 days ago
gemma : more consistent attention scaling for v2 and v3 (#13951)

* gemma : fix attn scale for 27B
* cont : apply scale before attn
* cont : consistent attention scaling
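The commit message describes scaling the query tensor before the attention computation, with a per-model scale factor. Below is a minimal C++ sketch of that idea, not the actual llama.cpp code: the helper names (`gemma_attn_scale`, `scale_queries`) are hypothetical, and the 27B scalar (hidden_size / n_head = 4608 / 32 = 144, vs. the usual 1/sqrt(head_dim)) is taken from the public Gemma 2 configuration as an illustrative assumption.

```cpp
// Sketch only: apply the attention scale to Q once, before QK^T,
// so every Gemma variant goes through the same attention/softmax path.
#include <cmath>
#include <cstdio>
#include <vector>

// Hypothetical helper: pick the pre-attention scalar for a Gemma variant.
// Gemma 2 27B uses hidden_size / n_head (4608/32 = 144) per its public config;
// other variants are assumed here to use head_dim.
static float gemma_attn_scale(int head_dim, bool is_gemma2_27b, int n_embd, int n_head) {
    const float pre_attn_scalar = is_gemma2_27b
        ? (float) n_embd / (float) n_head
        : (float) head_dim;
    return 1.0f / std::sqrt(pre_attn_scalar);
}

// "apply scale before attn": scale the query vector in place instead of
// folding the factor into the softmax later.
static void scale_queries(std::vector<float> & q, float scale) {
    for (float & x : q) {
        x *= scale;
    }
}

int main() {
    const int n_embd = 4608, n_head = 32, head_dim = 128; // Gemma 2 27B-like shapes

    std::vector<float> q(head_dim, 1.0f);
    const float scale = gemma_attn_scale(head_dim, /*is_gemma2_27b=*/true, n_embd, n_head);
    scale_queries(q, scale);

    std::printf("scale = %f (1/sqrt(144) for 27B, vs 1/sqrt(%d) otherwise)\n", scale, head_dim);
    return 0;
}
```

Pre-scaling Q keeps the downstream attention code identical across Gemma v2 and v3, which is the consistency the commit title refers to.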