metal : Q3_K speedup (#2995)

Commit

2 years ago

metal : Q3_K speedup (#2995)

* Slightly faster Q3_K and Q5_K on metal

* Another Q3_K speedup on metal

  Combined with the previous commit, we are now +9.6% for TG.
  PP is not affected, as it goes through the matrix multiplication templates.

* Slowly progressing on Q3_K on metal

  We are now 13% faster than master

* Another small improvement for Q3_K on metal

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>

References

#2995 - metal: Q3_K speedup

Author

ikawrakow

Parents

e64f5b55

llama.cpp ba7ffbb2 - metal : Q3_K speedup (#2995)