llama.cpp
27ad57a6
- Metal: faster Q4_0 and Q4_1 matrix x vector kernels (#2212)
Commit
2 years ago
Metal: faster Q4_0 and Q4_1 matrix x vector kernels (#2212)

* 3-5% faster Q4_0 on Metal
* 7-25% faster Q4_1 on Metal
* Oops, forgot to delete the original Q4_1 kernel

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
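
For readers unfamiliar with the formats named in the commit message, here is a minimal scalar sketch of what a Q4_0 or Q4_1 matrix x vector product computes. It is not the Metal kernel from this commit and says nothing about how the speedup was achieved. It assumes the usual ggml layout of 32 weights per block stored as 16 packed bytes, with Q4_0 decoding a 4-bit value q as (q - 8) * d and Q4_1 decoding it as q * d + m. The nibble ordering (low nibbles hold the first 16 weights, high nibbles the last 16) is also an assumption, and all function names are hypothetical.

```python
QK = 32  # weights per block (assumed to match ggml's Q4_0/Q4_1 block size)


def dequantize_q4_0(d, qs):
    """Decode one Q4_0 block: scale d plus 16 bytes holding 32 nibbles."""
    out = [0.0] * QK
    for j in range(QK // 2):
        out[j] = ((qs[j] & 0x0F) - 8) * d        # low nibble (assumed order)
        out[j + QK // 2] = ((qs[j] >> 4) - 8) * d  # high nibble
    return out


def dequantize_q4_1(d, m, qs):
    """Decode one Q4_1 block: scale d, offset m, 16 bytes of nibbles."""
    out = [0.0] * QK
    for j in range(QK // 2):
        out[j] = (qs[j] & 0x0F) * d + m
        out[j + QK // 2] = (qs[j] >> 4) * d + m
    return out


def matvec_q4_0(rows, y):
    """Multiply a Q4_0 matrix by a float vector y.

    rows is a list of matrix rows; each row is a list of (d, qs) blocks
    covering consecutive 32-element slices of y.
    """
    result = []
    for blocks in rows:
        acc = 0.0
        for b, (d, qs) in enumerate(blocks):
            w = dequantize_q4_0(d, qs)
            y_slice = y[b * QK:(b + 1) * QK]
            acc += sum(wi * yi for wi, yi in zip(w, y_slice))
        result.append(acc)
    return result
```

A GPU kernel would typically fuse the decode and the dot product and spread the work across threads rather than materialize each block as a list; the sketch only fixes the arithmetic that any such kernel has to reproduce.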
References
#2212 - Metal: faster Q4_0 and Q4_1 matrix x vector kernels
Author
ikawrakow
Parents
32c54116