llama.cpp
cf0e3ba1 - model : avoid ggml_cont_3d for fused QKV weights (#15662)

Commit

177 days ago

model : avoid ggml_cont_3d for fused QKV weights (#15662)

* model : avoid ggml_cont_3d for fused QKV weights
  ggml-ci
* kv-cache : make cpy_k and cpy_v implementation more readable
  ggml-ci
* cont : add comments
  ggml-ci
* cont : minor fix [no ci]
* cont : one more fix
* cont : clarity
  ggml-ci
* kv-cache : require contiguous heads of k_cur and v_cur
  ggml-ci

References

#15662 - model : avoid ggml_cont_3d for fused QKV weights

Author

ggerganov

Parents

d413dca0
