llama.cpp
4be44b7c - iq1_s: use IQ2_XXS for attn_output

Commit
1 year ago
iq1_s: use IQ2_XXS for attn_output

At a cost of 0.04 extra bpw this gives a big improvement in PPL.
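
The trade-off in the commit message can be illustrated with a rough sketch. This is not the code from the commit. `QuantType`, `pick_type`, `bits_per_weight`, and `extra_bpw` are hypothetical names, the bits-per-weight figures (1.5625 for IQ1_S, 2.0625 for IQ2_XXS) are assumed block sizes, and the layer shapes are assumed LLaMA-7B-like dimensions chosen only to show how bumping one tensor per layer could add a few hundredths of a bit per weight.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for the two quantization types named in the commit message.
enum class QuantType { IQ1_S, IQ2_XXS };

// Assumed storage cost in bits per weight (not stated in the commit message).
double bits_per_weight(QuantType t) {
    return t == QuantType::IQ2_XXS ? 2.0625 : 1.5625;
}

// Sketch of the rule: attn_output tensors get IQ2_XXS, all others stay IQ1_S.
QuantType pick_type(const std::string & tensor_name) {
    if (tensor_name.find("attn_output") != std::string::npos) {
        return QuantType::IQ2_XXS;
    }
    return QuantType::IQ1_S;
}

struct Tensor {
    std::string name;
    double      n_weights;
};

// Average bpw of one layer under the mixed rule, minus the uniform IQ1_S average.
double extra_bpw() {
    const double d  = 4096.0;   // assumed embedding size
    const double ff = 11008.0;  // assumed feed-forward size
    const Tensor layer[] = {
        {"blk.0.attn_q.weight",      d * d},
        {"blk.0.attn_k.weight",      d * d},
        {"blk.0.attn_v.weight",      d * d},
        {"blk.0.attn_output.weight", d * d},
        {"blk.0.ffn_gate.weight",    d * ff},
        {"blk.0.ffn_up.weight",      d * ff},
        {"blk.0.ffn_down.weight",    d * ff},
    };

    double total_bits    = 0.0;
    double total_weights = 0.0;
    for (const Tensor & t : layer) {
        total_bits    += t.n_weights * bits_per_weight(pick_type(t.name));
        total_weights += t.n_weights;
    }
    assert(total_weights > 0.0);
    return total_bits / total_weights - bits_per_weight(QuantType::IQ1_S);
}
```

Under these assumptions, `attn_output` is about 8% of a layer's weights and costs 0.5 bit more per weight, so `extra_bpw()` comes out at roughly 0.04, in line with the figure quoted in the commit message.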
Author
Iwan Kawrakow
Committer
Iwan Kawrakow