llama.cpp
IQ3_S: multiplier based code book
#5867
Open

ikawrakow wants to merge 24 commits into master from ik/iq3_s_multiplier
ikawrakow
Trying IQ3_S without a lookup table
9c752ff0
iq3_s(multiplier): use SIMD also in dequantize
1cc7cb2b
WIP
4c21c826
iq3_s_multiplier: CUDA and AVX2 works
160aceca
WIP
e43e81a5
WIP
0fe9cd48
iq3_s_mult: ARM_NEON works - 13 t/s
bf90920f
iq3_s_mult: Metal works - slower than lookup
3000e0ac
iq3_s_mult: quantization tuning
fe3c20b2
iq3_s_mult: alternative multiplier / bit twiddling
726aed30
iq3_s_mult: ifdef'd slow / fast versions
b6402fa7
iq3s_mult: ARM and Metal
5b9c8785
iq3s_mult: quantization tuning
8b713a98
iq3_s_mult: another alternative multiplier
dbe98dfe
iq3_s_mult: play with blocks of 16
f4cb4eac
iq3_s_mult: back to blocks of 32
e5e72562
iq3_s_mult: also CUDA
f2c2bd6b
iq3_s_mult: scalar dot product
b48bf8b4
ikawrakow added the demo label
ikawrakow
sorasoras
iq3_s_mult_shuffle: mult + shuffle based codebook
b5874822
ikawrakow
ggerganov commented on 2024-03-04
iq3_s_mult_shuffle: works on ARM_NEON and Metal
a6a263b9
iq3_s_mult: remove SLOW_MULT option
b1d753be
iq3_s_mult_shuffle: use new multiplier and cleanup
6d15da1e
iq3_s_mult_shuffle: use lookup table on CUDA
93034df7
iq3_s_mult_shuffle: use lookup table on Metal
31cecc87
ikawrakow force-pushed from 1b6dce31 to 31cecc87 1 year ago
ikawrakow
Green-Sky
sorasoras
ikawrakow