llama.cpp
IQ3_S: multiplier based code book
#5867
Open
ikawrakow wants to merge 24 commits into master from ik/iq3_s_multiplier
Trying IQ3_S without a lookup table
9c752ff0
iq3_s(multiplier): use SIMD also in dequantize
1cc7cb2b
WIP
4c21c826
iq3_s_multiplier: CUDA and AVX2 works
160aceca
WIP
e43e81a5
WIP
0fe9cd48
iq3_s_mult: ARM_NEON works - 13 t/s
bf90920f
iq3_s_mult: Metal works - slower than lookup
3000e0ac
iq3_s_mult: quantization tuning
fe3c20b2
iq3_s_mult: alternative multiplier / bit twidling
726aed30
iq3_s_mult: ifdef'd slow / fast versions
b6402fa7
iq3s_mult: ARM and Metal
5b9c8785
iq3s_mult: quantization tuning
8b713a98
iq3_s_mult: another alternative multiplier
dbe98dfe
iq3_s_mult: play with blocks of 16
f4cb4eac
iq3_s_mult: back to blocks of 32
e5e72562
iq3_s_mult: also CUDA
f2c2bd6b
iq3_s_mult: scalar dot product
b48bf8b4
ikawrakow added the demo label
iq3_s_mult_shuffle: mult + shuffle based codebook
b5874822
ggerganov commented on 2024-03-04
iq3_s_mult_shuffle: works on ARM_NEON and Metal
a6a263b9
iq3_s_mult: remove SLOW_MULT option
b1d753be
iq3_s_mult_shuffle: use new multiplier and cleanup
6d15da1e
iq3_s_mult_shuffle: use lookup table on CUDA
93034df7
iq3_s_mult_shuffle: use lookup table on Metal
31cecc87
ikawrakow force pushed from 1b6dce31 to 31cecc87 1 year ago
Reviewers: ggerganov
Assignees: No one assigned
Labels: demo
Milestone: No milestone