llama.cpp
gemma : more consistent attention scaling for v2 and v3
#13951
Merged

Commits
  • gemma : fix attn scale for 27B
    ggerganov committed 103 days ago
  • cont : apply scale before attn
    ggerganov committed 103 days ago
  • cont : consistent attention scaling
    ggerganov committed 102 days ago