fix exllama overflows (6cb6020e)
awq fallback to exllama (12c1f545)
post process exllama model (5766c55b)
add triton fallback to awq (8acbcb31)
fix missing g_idx and eventual overflow in triton kernel (0b5b8587)
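This first batch of commits wires up fallbacks for AWQ checkpoints: when the native AWQ kernels cannot be imported (for example on ROCm, the target of this branch), the model is routed through the exllama/GPTQ kernels, and from there to the triton kernel as a last resort. A minimal sketch of such a dispatch; the probe helpers, module names, and backend strings are illustrative assumptions, not TGI's actual API:

```python
import logging

logger = logging.getLogger(__name__)


def has_awq_kernels() -> bool:
    """Hypothetical probe: the AWQ GEMM extension may be absent, e.g. on ROCm."""
    try:
        import awq_inference_engine  # noqa: F401  (illustrative module name)
        return True
    except ImportError:
        return False


def has_exllama_kernels() -> bool:
    """Hypothetical probe for the exllama/GPTQ extension."""
    try:
        import exllama_kernels  # noqa: F401  (illustrative module name)
        return True
    except ImportError:
        return False


def select_awq_backend() -> str:
    """Pick a backend for an AWQ checkpoint: awq -> exllama/GPTQ -> triton."""
    if has_awq_kernels():
        return "awq"
    if has_exllama_kernels():
        logger.warning("AWQ kernels unavailable; falling back to exllama/GPTQ kernels")
        return "exllama"
    logger.warning("exllama kernels unavailable; falling back to the triton kernel")
    return "triton"
```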
Narsil commented on 2024-02-01
revert changes (8665ab07)
adapt awq weights to exllama/gptq kernels (fb59c562)
typing (bcdb02e4)
pass g_idx instead of changing triton kernel (994ed8e1)
none g_idx (af2c589c)
log message (cda5751b)
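The "adapt awq weights to exllama/gptq kernels" commit is the core of the fallback: AWQ and GPTQ both pack eight 4-bit weights into one int32, but AWQ packs along the output dimension in an interleaved nibble order, while the exllama/GPTQ kernels expect input-dimension packing in natural order, so the checkpoint has to be unpacked and repacked once at load time. A sketch of that repacking, assuming the AutoAWQ pack order; the function names are illustrative, not TGI's conversion code, and the zero points (which need the same de-interleaving plus GPTQ's off-by-one zero convention) are omitted:

```python
import torch

# Inverse of AWQ's nibble interleaving [0, 2, 4, 6, 1, 3, 5, 7] (per AutoAWQ).
REVERSE_AWQ_PACK_ORDER = [0, 4, 1, 5, 2, 6, 3, 7]


def unpack_awq_qweight(qweight: torch.Tensor) -> torch.Tensor:
    """Unpack an AWQ qweight of shape (in, out // 8) into 4-bit ints of shape (in, out)."""
    shifts = torch.arange(0, 32, 4, dtype=torch.int32, device=qweight.device)
    # Split every int32 into its eight nibbles, then undo AWQ's interleaving.
    iweight = (qweight[:, :, None] >> shifts[None, None, :]) & 0xF
    iweight = iweight[:, :, REVERSE_AWQ_PACK_ORDER]
    return iweight.reshape(qweight.shape[0], -1)


def pack_gptq_qweight(iweight: torch.Tensor) -> torch.Tensor:
    """Repack 4-bit ints of shape (in, out) into the GPTQ layout (in // 8, out)."""
    iweight = iweight.reshape(iweight.shape[0] // 8, 8, -1).to(torch.int32)
    packed = torch.zeros(
        iweight.shape[0], iweight.shape[2], dtype=torch.int32, device=iweight.device
    )
    for i in range(8):
        # Nibble i of packed row r holds the weight from input row 8 * r + i.
        packed |= iweight[:, i, :] << (4 * i)
    return packed
```

Once repacked this way, an AWQ layer can flow through the same exllama/GPTQ post-processing path as a native GPTQ checkpoint.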
fix exllama overflows (461dd6f1)
awq fallback to exllama (75086526)
post process exllama model (aa2014fc)
add triton fallback to awq (3963074c)
fix missing g_idx and eventual overflow in triton kernel (3ceeb858)
revert changes (212fdfff)
adapt awq weights to exllama/gptq kernels (8074c404)
typing (646ab282)
pass g_idx instead of changing triton kernel (bbe5bede)
none g_idx (76834c99)
log message (2629193e)
Narsil force-pushed from cda5751b to 2629193e 1 year ago
Narsil dismissed these changes on 2024-02-08
Updating the tests. (04d38a83)
Narsil dismissed their stale review via 04d38a83 1 year ago
Merge branch 'rocm-awq-support' of https://github.com/huggingface/tex… (e29fb799)
generate g_idx only for triton kernel (bc157af9)
Update llama gptq. (a76821e0)
Better error message on non rocm. (326f8e30)
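The "none g_idx" and "generate g_idx only for triton kernel" commits point at how an act-order-free AWQ checkpoint, which carries no g_idx tensor, is fed to the GPTQ code paths: the exllama path can take no group index at all, while the triton kernel wants an explicit one, which can be synthesized as a trivial ramp. A small sketch under those assumptions (function names are hypothetical):

```python
import torch


def make_trivial_g_idx(in_features: int, groupsize: int, device="cpu") -> torch.Tensor:
    """Group index implied by a checkpoint without act-order:
    input row i belongs to quantization group i // groupsize."""
    return torch.arange(in_features, dtype=torch.int32, device=device) // groupsize


def g_idx_for_backend(backend: str, in_features: int, groupsize: int):
    """Only the triton GPTQ kernel needs an explicit g_idx; exllama can take None."""
    if backend == "triton":
        return make_trivial_g_idx(in_features, groupsize)
    return None
```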
Narsil merged a4e58016 into main 1 year ago
Narsil deleted the rocm-awq-support branch 1 year ago
Assignees: No one assigned