llama.cpp
Use Q4_K for attn_v for Q2_K_S when n_gqa >= 4
#4996
Merged
