llama.cpp
SOTA 2-bit quants - part 2
#4856
Merged

ggerganov merged 11 commits into master from ik/iq2_2.31bpw
ikawrakow
iq2_xs: basics
3569fa3f
iq2_xs: this should have been in the basics
9f21b82e
iq2_xs: CUDA and scalar CPU works
9b6e38d8
iq2_xs: WIP Metal
0aacd551
iq2_xs: Metal now works
55e2cae8
iq2_xs: working, but dog slow, ARM_NEON dot product
ff49d876
iq2_xs: better ARM_NEON dot product
52ea3f79
iq2_xs: AVX2 dot product - 19.5 t/s
3198e94f
iq2_xs: faster AVX2 dot product
8299b03a
iq2_xs: had forgotten to delete iq2-data.h
a1610b05
ggerganov added high priority
ggerganov added need feedback
Add llama enum for IQ2_XS
9bfcb16f
ggerganov approved these changes on 2024-01-11
ggerganov merged 49662cbe into master 1 year ago
mofosyne added Tensor Encoding Scheme
mofosyne added Review Complexity : High
