llama.cpp
IQ4_NL: 4-bit non-linear quants with blocks of 32
#5590

Merged

Commits

iq4_nl: squash commits for easier rebase

Iwan Kawrakow committed 1 year ago
iq4_nl: Fix after merging with master

Iwan Kawrakow committed 1 year ago
iq4_nl: another fix after merging with master

Iwan Kawrakow committed 1 year ago
Use IQ4_NL instead of Q4_K when using k-quants is not possible

Iwan Kawrakow committed 1 year ago
Fix typo that makes several tests fail

Iwan Kawrakow committed 1 year ago
It was the ggml_vdotq thing missed inside the brackets

Iwan Kawrakow committed 1 year ago