llama.cpp
Add attention and final logit soft-capping, update scaling factor to Gemma2
#8197
Merged

abetlen merged 10 commits into master from add-gemma2-soft-capping
abetlen: Add attention and final logit softcapping. (commit 4d3f17b4)
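Soft-capping is the core of this PR: Gemma 2 squashes both the raw attention scores and the final output logits into (-cap, cap) with a scaled tanh, i.e. softcap(x) = cap * tanh(x / cap). A minimal sketch in ggml terms (the helper name soft_cap is illustrative, not from the PR's diff):

```cpp
#include "ggml.h"

// softcap(x) = cap * tanh(x / cap): applied to the Q*K^T scores before the
// softmax, and again to the final logits before sampling.
static struct ggml_tensor * soft_cap(
        struct ggml_context * ctx,
        struct ggml_tensor  * x,
        float                 cap) {
    x = ggml_scale(ctx, x, 1.0f / cap); // x / cap
    x = ggml_tanh (ctx, x);             // tanh(x / cap)
    return ggml_scale(ctx, x, cap);     // cap * tanh(x / cap)
}
```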
github-actions added the python label
abetlen: fix (commit d3d3c4eb)
slaren commented on 2024-06-28
abetlen: Add custom add_ functions (commit d1137c20)
abetlen: Disable flash attention for Gemma2 (commit f4424c15)
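The fused flash-attention kernels could not apply the tanh soft-cap between Q*K^T and the softmax, which is presumably why flash attention is turned off for this architecture. A hedged sketch of such a check (function and parameter names are assumptions, not the PR's exact code):

```cpp
// Hedged sketch: decide whether flash attention can stay enabled for a
// model that uses attention soft-capping. The fused FA kernels cannot
// apply the tanh cap between Q*K^T and the softmax, so fall back to the
// regular attention path whenever a cap is set.
static bool flash_attn_supported(bool flash_attn_requested, float attn_logit_softcapping) {
    return flash_attn_requested && attn_logit_softcapping == 0.0f;
}
```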
slaren commented on 2024-06-28
abetlen: Update src/llama.cpp (commit 3a247181)
slaren approved these changes on 2024-06-28
ngxson commented on 2024-06-28
mofosyne added the Review Complexity : Medium label
abetlen: Merge branch 'master' of github.com:ggerganov/llama.cpp into add-gemm… (commit 8edf73a7)
abetlen: Add default value for attention and final logit softcap value (commit bb715992)
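The Gemma 2 release sets these caps to 50.0 for the attention scores and 30.0 for the final logits, so those are the natural defaults. A sketch (struct and field names are illustrative, not the PR's exact identifiers):

```cpp
// Defaults matching Gemma 2's published configuration.
struct gemma2_softcap_defaults {
    float attn_logit_softcapping  = 50.0f; // cap for the Q*K^T attention scores
    float final_logit_softcapping = 30.0f; // cap for the final output logits
};
```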
abetlen: Merge branch 'add-gemma2-soft-capping' of github.com:ggerganov/llama.… (commit 6f2464e3)
ngxson approved these changes on 2024-06-29
ggerganov approved these changes on 2024-06-29
abetlen: Add custom kq scaling from Gemma2Attention (commit a8942790)
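Gemma 2 scales the attention scores by 1/sqrt(query_pre_attn_scalar) rather than the usual 1/sqrt(head_dim); for the 27B model the two differ (head_dim = 128, but query_pre_attn_scalar = hidden_size / n_heads = 4608 / 32 = 144). A sketch of the scale (the function name is illustrative):

```cpp
#include <cmath>

// kq_scale = 1/sqrt(query_pre_attn_scalar), Gemma 2's replacement for the
// usual 1/sqrt(head_dim) attention scaling.
static float gemma2_kq_scale(float query_pre_attn_scalar) {
    return 1.0f / std::sqrt(query_pre_attn_scalar);
}
```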
abetlen changed the title from "Add attention and final logit soft-capping to Gemma2" to "Add attention and final logit soft-capping, custom scaling factor to Gemma2" 1 year ago
abetlen: Remove custom pre attention scaling and use computed value instead. (commit 51f0bd50)
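Instead of carrying the scalar as a new custom GGUF key, the scale can be recomputed from hyperparameters that are already serialized. A sketch of that idea, assuming query_pre_attn_scalar equals n_embd / n_head for the converted models (the exact expression in the merged diff may differ):

```cpp
#include <cmath>

// Recompute the kq scale from hparams already present in the GGUF file.
static float computed_kq_scale(int n_embd, int n_head) {
    return 1.0f / std::sqrt((float) (n_embd / n_head));
}
```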
abetlen changed the title from "Add attention and final logit soft-capping, custom scaling factor to Gemma2" to "Add attention and final logit soft-capping, update scaling factor to Gemma2" 1 year ago
abetlen merged commit 1c5eba6f into master 1 year ago
slaren commented on 2024-06-30
eran-medan commented on 2024-07-02
