llama.cpp
SOTA 2-bit quants - part 2
#4856
Merged
ggerganov merged 11 commits into master from ik/iq2_2.31bpw
iq2_xs: basics
3569fa3f
iq2_xs: this should have been in the basics
9f21b82e
iq2_xs: CUDA and scalar CPU works
9b6e38d8
iq2_xs: WIP Metal
0aacd551
iq2_xs: Metal now works
55e2cae8
iq2_xs: working, but dog slow, ARM_NEON dot product
ff49d876
iq2_xs: better ARM_NEON dot product
52ea3f79
iq2_xs: AVX2 dot product - 19.5 t/s
3198e94f
iq2_xs: faster AVX2 dit product
8299b03a
iq2_xs: had forgotten to delete iq2-data.h
a1610b05
ggerganov added high priority
ggerganov added need feedback
Add llama enum for IQ2_XS
9bfcb16f
ggerganov approved these changes on 2024-01-11
ggerganov merged 49662cbe into master 1 year ago
mofosyne added Tensor Encoding Scheme
mofosyne added Review Complexity : High
Reviewers
ggerganov
Assignees
No one assigned
Labels
high priority
need feedback
Review Complexity : High
Tensor Encoding Scheme
Milestone
No milestone