llama.cpp
4d76a5f4 - Faster Q3_K implementation on Metal (#2307)

Commit

2 years ago

Faster Q3_K implementation on Metal (#2307)

* Faster Q3_K on Metal

* Additional Q3_K speedup on Metal

* Q3_K for QK_K = 64

* Better Q3_K for QK_K = 64

21.6 ms/t -> 21.1 ms/t

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>

References

#2307 - Faster Q3_K implementation on Metal

Author

ikawrakow

Parents

0db14fef
