llama.cpp
ba7ffbb2 - metal : Q3_K speedup (#2995)

Commit
2 years ago
metal : Q3_K speedup (#2995) * Slightly faster Q3_K and Q5_K on metal * Another Q3_K speedup on metal Combined with previous commit, we are now +9.6% for TG. PP is not affected as this happens via the matrix multiplication templates. * Slowly progressing on Q3_K on metal We are now 13% faster than master * nother small improvement for Q3_K on metal --------- Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
Author
Parents
Loading