PR #4773 SOTA 2-bit quants - SemanticDiff

SOTA 2-bit quants #4773

ikawrakow merged 17 commits into master from ik/iq2_2.06bpw

ikawrakow

ggerganov

ggerganov added high priority

slaren

ikawrakow

ikawrakow force pushed to a12488bc 1 year ago

ikawrakow

JohannesGaessler

JohannesGaessler

JohannesGaessler commented on 2024-01-04

Dampfinchen

ikawrakow

Dampfinchen

ikawrakow

JianbangZ

ikawrakow

sakura-umi

sorasoras

Dampfinchen

ggerganov

ikawrakow

he29-net

ikawrakow

he29-net

iq2_xxs: basics

4af24881

iq2_xxs: scalar and AVX2 dot products

7ef63896

iq2_xxs: ARM_NEON dot product

7b72318e

iq2_xxs: WIP Metal

d383f00e

iq2_xxs: Metal dot product now works

dd296101

iq2_xxs: slightly faster dot product

1c96aa0d

iq2_xxs: slightly faster dot product

e211fadc

iq2_xxs: even faster Metal dot product

065cc8cb

iq2_xxs: dequantize CUDA kernel - fix conflict with master

06e6908a

iq2_xxs: quantized CUDA dot product (MMVQ)

82405219

iq2_xxs: slightly faster CUDA dot product

c19d0d09

iq2_xxs: add to llama ftype enum

fd42737c

iq2_xxs: fix MoE on Metal

47ae9b8f

Fix missing MMQ ops when on hipBLAS

61c04053

Fix bug in qequantize_row_iq2_xxs

7db967e8

ikawrakow

ikawrakow force pushed to 7db967e8 1 year ago

Fixing tests

5684d790

JianbangZ

ggerganov

ggerganov approved these changes on 2024-01-08

JohannesGaessler

JohannesGaessler approved these changes on 2024-01-08

PR suggestion

bad5f7f3

ikawrakow

ikawrakow merged dd5ae064 into master 1 year ago

ikawrakow

ikawrakow deleted the ik/iq2_2.06bpw branch 1 year ago

TheBloke

JianbangZ

TheBloke

Dampfinchen

ikawrakow

Dampfinchen

jxy

ggerganov

x4080

jxy

jxy

Ttl

x4080

x4080

JianbangZ

joseph777111

ikawrakow

JianbangZ

x4080

tsengalb99

ikawrakow

tsengalb99

mofosyne

mofosyne added Tensor Encoding Scheme

mofosyne

mofosyne added Review Complexity : High

afsara-ben

Reviewers

JohannesGaessler

ggerganov

Assignees

No one assigned

Labels

high priority

Review Complexity : High

Tensor Encoding Scheme

Milestone

No milestone