llama.cpp
9c5b594c
- iq3_s: another small ARM_NEON improvement
Commit
1 year ago
iq3_s: another small ARM_NEON improvement

10.7 -> 11.0 t/s. Using vmulq_s8 is faster than the xor - sub trick that works best on AVX2.
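The two sign-application approaches named in the message can be illustrated with a scalar model. This is only a sketch and not the kernel code from the commit: the function names are hypothetical, and it assumes the "xor - sub trick" means computing (x ^ m) - m with a per-byte mask m of 0x00 (keep) or 0xFF (negate), while the multiply variant uses a per-byte sign of +1 or -1, which is what a lane-wise vmulq_s8 would compute on NEON.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of the xor - sub trick.
 * mask is 0x00 to keep x or 0xFF to negate it:
 * (x ^ 0xFF) - 0xFF == ~x + 1 == -x in 8-bit two's complement. */
static int8_t apply_sign_xor_sub(int8_t x, uint8_t mask) {
    return (int8_t)(uint8_t)(((uint8_t)x ^ mask) - mask);
}

/* Hypothetical scalar model of the multiply approach:
 * sign is +1 or -1, one lane of what vmulq_s8 does per byte.
 * The result wraps to 8 bits, as the vector instruction does. */
static int8_t apply_sign_mul(int8_t x, int8_t sign) {
    return (int8_t)(uint8_t)(x * sign);
}

/* Check that both approaches agree for every 8-bit input. */
static void check_sign_tricks_agree(void) {
    for (int v = -128; v <= 127; ++v) {
        int8_t x = (int8_t)v;
        assert(apply_sign_xor_sub(x, 0x00) == apply_sign_mul(x, 1));
        assert(apply_sign_xor_sub(x, 0xFF) == apply_sign_mul(x, -1));
    }
}
```

Both functions give the same result for every input, so the choice between them is purely a matter of which instruction sequence is cheaper on the target. The commit reports that the multiply form was faster on ARM_NEON, while the xor and subtract form works best on AVX2.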
References
#5829 - IQ3_S improvements
Author
Iwan Kawrakow
Parents
1e949891