fix exllama overflows (6cb6020e)
awq fallback to exllama (12c1f545)
post process exllama model (5766c55b)
add triton fallback to awq (8acbcb31)
fix missing g_idx and eventual overflow in triton kernel (0b5b8587)
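This first batch of commits wires up fallbacks for AWQ checkpoints: when the native AWQ kernels cannot be imported (for example on ROCm, the target of this branch), the model is routed through the exllama/GPTQ kernels, and from there to the triton kernel as a last resort. A minimal sketch of such a dispatch; the probe helpers, module names, and backend strings are illustrative assumptions, not TGI's actual API:

```python
import logging

logger = logging.getLogger(__name__)


def has_awq_kernels() -> bool:
    """Hypothetical probe: the AWQ GEMM extension may be absent, e.g. on ROCm."""
    try:
        import awq_inference_engine  # noqa: F401  (illustrative module name)
        return True
    except ImportError:
        return False


def has_exllama_kernels() -> bool:
    """Hypothetical probe for the exllama/GPTQ extension."""
    try:
        import exllama_kernels  # noqa: F401  (illustrative module name)
        return True
    except ImportError:
        return False


def select_awq_backend() -> str:
    """Pick a backend for an AWQ checkpoint: awq -> exllama/GPTQ -> triton."""
    if has_awq_kernels():
        return "awq"
    if has_exllama_kernels():
        logger.warning("AWQ kernels unavailable; falling back to exllama/GPTQ kernels")
        return "exllama"
    logger.warning("exllama kernels unavailable; falling back to the triton kernel")
    return "triton"
```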
Narsil commented on 2024-02-01
revert changes (8665ab07)
adapt awq weights to exllama/gptq kernels (fb59c562)
typing (bcdb02e4)
pass g_idx instead of changing triton kernel (994ed8e1)
none g_idx (af2c589c)
log message (cda5751b)
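The "adapt awq weights to exllama/gptq kernels" commit is the core of the fallback: AWQ and GPTQ both pack eight 4-bit weights into one int32, but AWQ packs along the output dimension in an interleaved nibble order, while the exllama/GPTQ kernels expect input-dimension packing in natural order, so the checkpoint has to be unpacked and repacked once at load time. A sketch of that repacking, assuming the AutoAWQ pack order; the function names are illustrative, not TGI's conversion code, and the zero points (which need the same de-interleaving plus GPTQ's off-by-one zero convention) are omitted:

```python
import torch

# Inverse of AWQ's nibble interleaving [0, 2, 4, 6, 1, 3, 5, 7] (per AutoAWQ).
REVERSE_AWQ_PACK_ORDER = [0, 4, 1, 5, 2, 6, 3, 7]


def unpack_awq_qweight(qweight: torch.Tensor) -> torch.Tensor:
    """Unpack an AWQ qweight of shape (in, out // 8) into 4-bit ints of shape (in, out)."""
    shifts = torch.arange(0, 32, 4, dtype=torch.int32, device=qweight.device)
    # Split every int32 into its eight nibbles, then undo AWQ's interleaving.
    iweight = (qweight[:, :, None] >> shifts[None, None, :]) & 0xF
    iweight = iweight[:, :, REVERSE_AWQ_PACK_ORDER]
    return iweight.reshape(qweight.shape[0], -1)


def pack_gptq_qweight(iweight: torch.Tensor) -> torch.Tensor:
    """Repack 4-bit ints of shape (in, out) into the GPTQ layout (in // 8, out)."""
    iweight = iweight.reshape(iweight.shape[0] // 8, 8, -1).to(torch.int32)
    packed = torch.zeros(
        iweight.shape[0], iweight.shape[2], dtype=torch.int32, device=iweight.device
    )
    for i in range(8):
        # Nibble i of packed row r holds the weight from input row 8 * r + i.
        packed |= iweight[:, i, :] << (4 * i)
    return packed
```

Once repacked this way, an AWQ layer can flow through the same exllama/GPTQ post-processing path as a native GPTQ checkpoint.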
fix exllama overflows (461dd6f1)
awq fallback to exllama (75086526)
post process exllama model (aa2014fc)
add triton fallback to awq (3963074c)
fix missing g_idx and eventual overflow in triton kernel (3ceeb858)
revert changes (212fdfff)
adapt awq weights to exllama/gptq kernels (8074c404)
typing (646ab282)
pass g_idx instead of changing triton kernel (bbe5bede)
none g_idx (76834c99)
log message (2629193e)
Narsil force-pushed from cda5751b to 2629193e 1 year ago
Narsil dismissed these changes on 2024-02-08
Updating the tests. (04d38a83)
Narsil dismissed their stale review via 04d38a83 1 year ago
Merge branch 'rocm-awq-support' of https://github.com/huggingface/tex… (e29fb799)
generate g_idx only for triton kernel (bc157af9)
Update llama gptq. (a76821e0)
Better error message on non rocm. (326f8e30)
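The "none g_idx" and "generate g_idx only for triton kernel" commits point at how an act-order-free AWQ checkpoint, which carries no g_idx tensor, is fed to the GPTQ code paths: the exllama path can take no group index at all, while the triton kernel wants an explicit one, which can be synthesized as a trivial ramp. A small sketch under those assumptions (function names are hypothetical):

```python
import torch


def make_trivial_g_idx(in_features: int, groupsize: int, device="cpu") -> torch.Tensor:
    """Group index implied by a checkpoint without act-order:
    input row i belongs to quantization group i // groupsize."""
    return torch.arange(in_features, dtype=torch.int32, device=device) // groupsize


def g_idx_for_backend(backend: str, in_features: int, groupsize: int):
    """Only the triton GPTQ kernel needs an explicit g_idx; exllama can take None."""
    if backend == "triton":
        return make_trivial_g_idx(in_features, groupsize)
    return None
```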
Narsil merged a4e58016 into main 1 year ago
Narsil deleted the rocm-awq-support branch 1 year ago
Assignees: No one assigned