llama.cpp
ggml-quants : weighted rounding algorithms with cumulative search
#12557
Open
Commits (13)
ggml-quants : improve IQ4_NL, IQ4_XS, and Q3_K (compilade, committed 310 days ago)
ggml-quants : better and faster make_qkxs_quants (compilade, committed 310 days ago)
ggml-quants : improve imatrix behavior for TQ1_0, TQ2_0, Q4_0, Q5_0 (compilade, committed 310 days ago)
ggml-quants : improve TQ2_0 imatrix (compilade, committed 296 days ago)
ggml-quants : remove some commented code (compilade, committed 288 days ago)
ggml-quants : faster exhaustive IQ4_NL rounding with k_heap (compilade, committed 288 days ago)
ggml-quants : use a max-heap for linear quants like Q3_K (compilade, committed 283 days ago)
ggml-quants : use qkxh in more places (compilade, committed 282 days ago)
ggml-quants : use a max-heap for TQ1_0 and TQ2_0 quantization (compilade, committed 281 days ago)
ggml-quants : remove slower qsort-based cumulative search (compilade, committed 281 days ago)
Merge branch 'master' into compilade/optimal-rounding (compilade, committed 281 days ago)
ggml-quants : restore Q2_K use of make_qp_quants (compilade, committed 281 days ago)
ggml-quants : fix some edge cases in make_qkxh_nl_quants (compilade, committed 280 days ago)