llama.cpp
2b3a665d - llama : use Q4_K for attn_v for Q2_K_S when n_gqa >= 4 (#4996)

Commit
1 year ago
llama : use Q4_K for attn_v for Q2_K_S when n_gqa >= 4 (#4996)

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>