llama.cpp PR #5453: 1.5 bit quantization (Merged)
ggerganov merged 15 commits into master from ik/iq1_s
ikawrakow added the demo label
ikawrakow force pushed from 5c4cb8dd to a4982831 1 year ago
ikawrakow force pushed from 9803f7ad to 584b3692 1 year ago
80cd5bae iq1_s: WIP basics
a9d48e97 iq1_s: CUDA is working
d94139bf iq1_s: scalar CPU dot product
592b3b26 iq1_s: WIP AVX2 dot product - something is not right
5574533a Fix tests
dc0b14be Fix shadow warnings
67e7c423 Fix after merge with latest master
2ffb05ac iq1_s: AVX2 finally works
77301492 iq1_s: ARM_NEON dot product. Works, but not very fast
307c5f61 iq1_s: better grid
4be44b7c iq1_s: use IQ2_XXS for attn_output
ikawrakow force pushed from 584b3692 to 4be44b7c 1 year ago
020b548e iq1_s: Metal basics
425c6bbb iq1_s: Metal works, but quite slow
f604a179 iq1_s: Tests
5c977221 iq1_s: slightly faster dot product
ikawrakow added the enhancement label
ikawrakow marked this pull request as ready for review 1 year ago
ikawrakow requested a review from ggerganov 1 year ago
ggerganov approved these changes on 2024-02-18
ggerganov merged bd2d4e39 into master 1 year ago
Artem-B commented on 2024-02-18
mofosyne added the Tensor Encoding Scheme label
mofosyne added the Review Complexity : High label
Reviewers: ggerganov, Artem-B
Assignees: No one assigned
Labels: enhancement, demo, Review Complexity : High, Tensor Encoding Scheme
Milestone: No milestone