llama.cpp
gemma : more consistent attention scaling for v2 and v3
#13951
Merged

Commits
  • gemma : fix attn scale for 27B
    ggerganov committed 103 days ago
  • cont : apply scale before attn
    ggerganov committed 103 days ago
  • cont : consistent attention scaling
    ggerganov committed 102 days ago