text-generation-inference
Inference support for GPTQ (llama + falcon tested) + Quantization script
#438
Merged

OlivierDehaene merged 19 commits into main from support_gptq
gsaivinay commented on 2023-06-09
[WIP] Inference support for GPTQ (llama at least)
9a12941b
Removing dead code.
92f85c96
Fixing the dockerfile (require triton + gcc for compiling).
0b585921
Narsil Typo.
da8ebf16
Adding quantization scripts.
5a727153
Functioning quantization script.
a0a194c3
Some fixes.
ae308f88
Re-enabling dim=dim in TensorParallelColumn because llama.
3fb8979a
Neox.
dadbbc27
Fixing few things
ffe8fc46
Fixing register bias + gptq_bits type.
ee1f94e6
Narsil Falcon
e5e552b4
Narsil Tiny fixes for falcon.
55cf4d25
Narsil force pushed from a6587a91 to 55cf4d25 2 years ago
Narsil No one saw that, therefore it didn't happen.
5de68637
OlivierDehaene commented on 2023-06-14
Narsil Remove lots of dead code, move triton to hard requirement
732da694
Narsil Triton is actually a dependency of torch on linux.
054a3d09
Narsil marked this pull request as ready for review 2 years ago
Narsil changed the title from "[WIP] Inference support for GPTQ (llama at least)" to "Inference support for GPTQ (llama + falcon tested) + Quantization script" 2 years ago
Narsil Typo.
983c813f
Narsil Santacoder GPTQ support (quantized model seems awful, not sure if it's
16d0fb04
OlivierDehaene commented on 2023-06-19
psinger commented on 2023-06-20
Narsil Apply suggestions from code review
a8aa688a
OlivierDehaene merged aefde28b into main 2 years ago
OlivierDehaene deleted the support_gptq branch 2 years ago