llama.cpp
Quantization improvements for k_quants #2707 (Merged)

ikawrakow merged 12 commits into master from ik/better_q234_k:
f26f9ef4  Improve LLaMA-2 2-, 3- and 4-bit quantization
77aea721  Minor 4-bit quantization improvement
ec9cb753  Some more fine tuning
4f8dcb16  Adding make_qkx2_quants
e9f1340c  Another minor improvement
1c1f985b  Q2_K improvement
404e43cc  Iterating
9f78d4cd  Revert Q5_K back to make_qkx1_quants
e2af308c  Better Q6_K
b7063393  make_qkx2_quants is better for Q5_K after all
35a0b974  Fix after rebasing on master
fdf73db5  Fix for changed tensor names
ggerganov approved these changes on 2023-08-22
Green-Sky changed the title from "Quantization imrovements for k_quants" to "Quantization improvements for k_quants" 2 years ago
ikawrakow merged bac66994 into master 2 years ago
ikawrakow deleted the ik/better_q234_k branch 2 years ago