models : fix the attn_factor for mistral3 graphs + improve consistency #17945
models : fix the attn_factor for mistral3 graphs
1df2e908
ngxson
approved these changes
on 2025-12-11
cont : rework attn_factor correction logic
59b9e36f
cont : make deepseek2 consistent
45930c97
ggerganov
force pushed from 0ca55b61 to 45930c97 188 days ago
cont : add TODO
45875df2
ggerganov
marked this pull request as ready for review 188 days ago
cont : special-case DSv2
06eb8e86
ggerganov
changed the title from "models : fix the attn_factor for mistral3 graphs" to "models : fix the attn_factor for mistral3 graphs + improve consistency" 188 days ago
cont : revert Mistral 3 Large changes
01b77b57
ngxson
commented
on 2025-12-12
cont : fix DS2 to use the original attn_factor
7320a2dc
ngxson
approved these changes
on 2025-12-12
cont : minor comments [no ci]
d6477e14
ggerganov
merged 7bed317f into master 188 days ago
ggerganov
deleted the gg/mistral-fix-attn-factor branch 188 days ago