llama.cpp
39e3a429
- iq3_s: somewhat faster AVX2 dot product
Commit
1 year ago
iq3_s: somewhat faster AVX2 dot product

On a Ryzen 7950X, TG-128 increases to 16 t/s from 15.5 t/s using 16 threads. For 8 threads it is 13.85 t/s vs 11.75 t/s. PP-512 increases to 28.5 t/s from 23.8 t/s.
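For a quick sense of scale, the throughput figures quoted in the commit message can be turned into relative gains. This is only a sketch of the arithmetic and is not part of the commit; the helper name `speedup_percent` and the dictionary labels are made up for illustration, and the message does not say how many threads the PP-512 run used.

```python
def speedup_percent(before: float, after: float) -> float:
    """Relative throughput gain in percent, given tokens/second before and after."""
    return (after / before - 1.0) * 100.0


# Figures copied from the commit message (tokens per second, before -> after).
# The labels are illustrative only.
results = {
    "TG-128, 16 threads": (15.5, 16.0),
    "TG-128, 8 threads": (11.75, 13.85),
    "PP-512": (23.8, 28.5),
}

for name, (before, after) in results.items():
    gain = speedup_percent(before, after)
    print(f"{name}: {before} -> {after} t/s ({gain:+.1f}%)")
```

By this calculation the gain is about 3% for TG-128 at 16 threads, about 18% at 8 threads, and about 20% for PP-512.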
References
#5829 - IQ3_S improvements
Author
Iwan Kawrakow
Parents
3ab8b3a9