llama.cpp
cf0e3ba1 - model : avoid ggml_cont_3d for fused QKV weights (#15662)

Commit
59 days ago
model : avoid ggml_cont_3d for fused QKV weights (#15662)

* model : avoid ggml_cont_3d for fused QKV weights ggml-ci
* kv-cache : make cpy_k and cpy_v implementation more readable ggml-ci
* cont : add comments ggml-ci
* cont : minor fix [no ci]
* cont : one more fix
* cont : clarity ggml-ci
* kv-cache : require contiguous heads of k_cur and v_cur ggml-ci
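
The commit title suggests that the Q, K and V slices of a fused QKV result are used as strided views instead of being copied into contiguous tensors. The sketch below illustrates that general idea in plain C++ without ggml. It is not the code from this commit: `View3D`, `QKVViews` and `split_fused_qkv` are hypothetical names, and the per-token row layout `[Q | K | V]` is an assumption made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical strided view over [head_dim, n_head, n_tokens].
// Strides are counted in elements, not bytes.
struct View3D {
    const float * data;
    std::size_t   ne[3]; // extents: head_dim, n_head, n_tokens
    std::size_t   nb[3]; // element strides for each dimension

    float at(std::size_t i, std::size_t h, std::size_t t) const {
        assert(i < ne[0] && h < ne[1] && t < ne[2]);
        return data[i*nb[0] + h*nb[1] + t*nb[2]];
    }

    // Each head is a dense run of head_dim elements.
    bool heads_contiguous() const { return nb[0] == 1; }

    // The whole view is one dense block (what a copy would produce).
    bool fully_contiguous() const {
        return nb[0] == 1 && nb[1] == ne[0] && nb[2] == ne[0]*ne[1];
    }
};

struct QKVViews { View3D q, k, v; };

// Assumed layout: one row per token, [Q | K | V] side by side.
// No data is copied; the views only carry an offset and strides.
QKVViews split_fused_qkv(const std::vector<float> & qkv,
                         std::size_t head_dim, std::size_t n_head,
                         std::size_t n_head_kv, std::size_t n_tokens) {
    const std::size_t n_q  = n_head    * head_dim;
    const std::size_t n_kv = n_head_kv * head_dim;
    const std::size_t row  = n_q + 2*n_kv; // token stride of the fused buffer
    assert(qkv.size() == row * n_tokens);

    const float * base = qkv.data();
    return {
        { base,              { head_dim, n_head,    n_tokens }, { 1, head_dim, row } },
        { base + n_q,        { head_dim, n_head_kv, n_tokens }, { 1, head_dim, row } },
        { base + n_q + n_kv, { head_dim, n_head_kv, n_tokens }, { 1, head_dim, row } },
    };
}
```

In this sketch every view keeps its heads contiguous while the token stride stays equal to the fused row width, so none of the views is fully contiguous. A consumer that only needs dense heads can read such a view directly, which is the property the last bullet of the commit message appears to require of `k_cur` and `v_cur`.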