PR #4856 SOTA 2-bit quants - part 2

ggerganov merged 11 commits into master from ik/iq2_2.31bpw

iq2_xs: basics

3569fa3f

iq2_xs: this should have been in the basics

9f21b82e

iq2_xs: CUDA and scalar CPU works

9b6e38d8

iq2_xs: WIP Metal

0aacd551

iq2_xs: Metal now works

55e2cae8

iq2_xs: working, but dog slow, ARM_NEON dot product

ff49d876

iq2_xs: better ARM_NEON dot product

52ea3f79

iq2_xs: AVX2 dot product - 19.5 t/s

3198e94f

iq2_xs: faster AVX2 dot product

8299b03a

iq2_xs: had forgotten to delete iq2-data.h

a1610b05

ggerganov added high priority

ggerganov added need feedback

Add llama enum for IQ2_XS

9bfcb16f

ggerganov approved these changes on 2024-01-11

ggerganov merged 49662cbe into master 2 years ago

mofosyne added Tensor Encoding Scheme

mofosyne added Review Complexity : High
