llama.cpp
4d76a5f4 - Faster Q3_K implementation on Metal (#2307)
Commit
2 years ago
Faster Q3_K implementation on Metal (#2307)

* Faster Q3_K on Metal

* Additional Q3_K speedup on Metal

* Q3_K for QK_K = 64

* Better Q3_K for QK_K = 64

21.6 ms/t -> 21.1 ms/t

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
References
#2307 - Faster Q3_K implementation on Metal
Author
ikawrakow
Parents
0db14fef