llama.cpp
Add attention and final logit soft-capping, update scaling factor to Gemma2
#8197
Merged

abetlen merged 10 commits into master from add-gemma2-soft-capping
abetlen: Add attention and final logit softcapping. (commit 4d3f17b4)
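Soft-capping is the core of this PR: Gemma 2 squashes both the raw attention scores and the final output logits into (-cap, cap) with a scaled tanh, i.e. softcap(x) = cap * tanh(x / cap). A minimal sketch in ggml terms (the helper name soft_cap is illustrative, not from the PR's diff):

```cpp
#include "ggml.h"

// softcap(x) = cap * tanh(x / cap): applied to the Q*K^T scores before the
// softmax, and again to the final logits before sampling.
static struct ggml_tensor * soft_cap(
        struct ggml_context * ctx,
        struct ggml_tensor  * x,
        float                 cap) {
    x = ggml_scale(ctx, x, 1.0f / cap); // x / cap
    x = ggml_tanh (ctx, x);             // tanh(x / cap)
    return ggml_scale(ctx, x, cap);     // cap * tanh(x / cap)
}
```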
github-actions added the python label
abetlen: fix (commit d3d3c4eb)
slaren commented on 2024-06-28
abetlen: Add custom add_ functions (commit d1137c20)
abetlen: Disable flash attention for Gemma2 (commit f4424c15)
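The fused flash-attention kernels could not apply the tanh soft-cap between Q*K^T and the softmax, which is presumably why flash attention is turned off for this architecture. A hedged sketch of such a check (function and parameter names are assumptions, not the PR's exact code):

```cpp
// Hedged sketch: decide whether flash attention can stay enabled for a
// model that uses attention soft-capping. The fused FA kernels cannot
// apply the tanh cap between Q*K^T and the softmax, so fall back to the
// regular attention path whenever a cap is set.
static bool flash_attn_supported(bool flash_attn_requested, float attn_logit_softcapping) {
    return flash_attn_requested && attn_logit_softcapping == 0.0f;
}
```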
slaren commented on 2024-06-28
abetlen: Update src/llama.cpp (commit 3a247181)
slaren approved these changes on 2024-06-28
ngxson commented on 2024-06-28
mofosyne added the Review Complexity : Medium label
abetlen: Merge branch 'master' of github.com:ggerganov/llama.cpp into add-gemm… (commit 8edf73a7)
abetlen: Add default value for attention and final logit softcap value (commit bb715992)
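The Gemma 2 release sets these caps to 50.0 for the attention scores and 30.0 for the final logits, so those are the natural defaults. A sketch (struct and field names are illustrative, not the PR's exact identifiers):

```cpp
// Defaults matching Gemma 2's published configuration.
struct gemma2_softcap_defaults {
    float attn_logit_softcapping  = 50.0f; // cap for the Q*K^T attention scores
    float final_logit_softcapping = 30.0f; // cap for the final output logits
};
```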
abetlen: Merge branch 'add-gemma2-soft-capping' of github.com:ggerganov/llama.… (commit 6f2464e3)
ngxson approved these changes on 2024-06-29
ggerganov approved these changes on 2024-06-29
abetlen: Add custom kq scaling from Gemma2Attention (commit a8942790)
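Gemma 2 scales the attention scores by 1/sqrt(query_pre_attn_scalar) rather than the usual 1/sqrt(head_dim); for the 27B model the two differ (head_dim = 128, but query_pre_attn_scalar = hidden_size / n_heads = 4608 / 32 = 144). A sketch of the scale (the function name is illustrative):

```cpp
#include <cmath>

// kq_scale = 1/sqrt(query_pre_attn_scalar), Gemma 2's replacement for the
// usual 1/sqrt(head_dim) attention scaling.
static float gemma2_kq_scale(float query_pre_attn_scalar) {
    return 1.0f / std::sqrt(query_pre_attn_scalar);
}
```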
abetlen changed the title from "Add attention and final logit soft-capping to Gemma2" to "Add attention and final logit soft-capping, custom scaling factor to Gemma2" 1 year ago
abetlen: Remove custom pre attention scaling and use computed value instead. (commit 51f0bd50)
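Instead of carrying the scalar as a new custom GGUF key, the scale can be recomputed from hyperparameters that are already serialized. A sketch of that idea, assuming query_pre_attn_scalar equals n_embd / n_head for the converted models (the exact expression in the merged diff may differ):

```cpp
#include <cmath>

// Recompute the kq scale from hparams already present in the GGUF file.
static float computed_kq_scale(int n_embd, int n_head) {
    return 1.0f / std::sqrt((float) (n_embd / n_head));
}
```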
abetlen changed the title from "Add attention and final logit soft-capping, custom scaling factor to Gemma2" to "Add attention and final logit soft-capping, update scaling factor to Gemma2" 1 year ago
abetlen merged commit 1c5eba6f into master 1 year ago
slaren commented on 2024-06-30
eran-medan commented on 2024-07-02
