Fix kq_scale for the attention layers of PLaMo2 #14892
Commits:
- 7baf4fd1 Fix dimensions for expand
- e39bc092 Change dimensions to copy states to cache
- bd4d2e1c Fix the default value for plamo2 conversion
- c475203c Fix scale given to build_attn
CISC approved these changes on 2025-07-26.
CISC commented on 2025-07-26.
Review suggestion commits:
- 75f0a0d7 Update src/llama-model.cpp
- 429639d0 Update src/llama-model.cpp
- 60a705de Update src/llama-model.cpp
CISC merged 1dc9614e into master 139 days ago.
mitmul deleted the mitmul/fix-build-attn-scale-plamo2 branch 139 days ago.