llama.cpp
llama : use n_embd_head_v instead of n_embd_head_k when reshaping kqv #7327
Merged