llama.cpp 3be11510 - ggml-quants : use a max-heap for linear quants like Q3_K
Commit
272 days ago
ggml-quants : use a max-heap for linear quants like Q3_K

Slightly faster than the previous method.
References
#12557 - ggml-quants : weighted rounding algorithms with cumulative search
Author
compilade
Parents
30ad9c28