xla
Support int4 weight in quantized matmul/linear
#7235

Merged

lsy323 merged 24 commits into master from lsiyuan/int4-quant-ops

add quantized layers per channel

f7c200af

enhance tests, clean up

f48c666a

add q ops to ci

65f6fcab

add README

c042e2fa

update readme

878d7e78

update readme

b4542372

initial commit for int4

e69627fe

add some tests

810f1049

use literal

b8ed810c

fix bad malloc

27acbbb7

add a subchannel test

7c52bf92

add tests

9fd7caa4

add TPU numerical check

fa29ba27

refactor

9c47f637

format

256a2616

merge

059053be

lsy323 marked this pull request as ready for review 2 years ago

update docl

5c4c7f0e

rename to cast_int4

03f46f14

lsy323 force pushed from 43301176 to 03f46f14 2 years ago

remove dup files

11be78b3

format

3a1d83f7

JackCaoG requested a review from JackCaoG 2 years ago

remove comment

62a0b17a

remove comment

5fe2f09e

JackCaoG commented on 2024-06-10

remove unused pack unpack and test

9addde9c

lsy323 requested a review from JackCaoG 2 years ago

JackCaoG approved these changes on 2024-06-10

lsy323 added quantization

fix import

77c61a6c

lsy323 merged ac371fb8 into master 2 years ago

miladm assigned miladm 2 years ago

miladm assigned lsy323 2 years ago

miladm unassigned miladm 2 years ago

lsy323 deleted the lsiyuan/int4-quant-ops branch 1 year ago

Reviewers

JackCaoG

Assignees

lsy323

Labels

quantization

Milestone

No milestone
