llama.cpp
model : avoid ggml_cont_3d for fused QKV weights
#15662
Merged

ggerganov merged 7 commits into master from gg/model-avoid-cont3d
ggerganov marked this pull request as ready for review 60 days ago
ggerganov model : avoid ggml_cont_3d for fused QKV weights
bb1202b2
ggerganov kv-cache : make cpy_k and cpy_v implementation more readable
85a5ea36
ggerganov cont : add comments
3dec397b
ggerganov force pushed from f15d515e to 3dec397b 60 days ago
ggerganov cont : minor fix [no ci]
c62c354f
ggerganov cont : one more fix
1efa9e8a
ggerganov cont : clarity
d6be191d
CISC approved these changes on 2025-09-08
ggerganov kv-cache : require contiguous heads of k_cur and v_cur
60d6e7c6
ggerganov merged cf0e3ba1 into master 60 days ago
ggerganov deleted the gg/model-avoid-cont3d branch 60 days ago

