Add GPTQ Quantization #1216
v1 test draft
a1a33efb
code runs but outputs gibberish.
6ca0f123
draft v1.1
0fe030bb
remove duplicate
71129bb8
remove dep to transformers and cleaning
ac7023c1
Add serialization and loading
4abb9b8c
Clean code and doc
7150a973
add flexibility
24728047
remove triton
88dbe0e4
remove some dep with transformers
90ec342d
add testing
c7e49a00
make style
110c8c1c
add accelerate flag
f64632f5
handle device placement
ed9b743e
make style
f65a9793
Apply suggestions
7720b364
add doc in data.py
437329a6
apply suggestion for utils file
cfe62390
remove multiple output
3254d6e1
fix Optional
939e4ab5
Apply suggestions from code review
e39f5b7f
remove useless check
f8a25e21
fix doc and style
9afdbb48
fix name
e404bde5
replace catcher by pre-forward hook
89d18d69
update docstring for true_sequential
7ac898a2
apply suggestion
e34d960f
Fix import
d18226ae
Add docstring for tests
754cd011
move args
6d10f73f
fix typo
bba3516d
fix cpu offload and tokenizer
e6622403
fix typo
58e3e7b1
fix offload cpu
3633d43d
modify attribute
1df19a14
more explicit error
28f4ce49
dataset optional
a019885c
add tqdm bar instead
d2720996
style
28acd3ca
add doc
ae77ffa9
replace by tqdm.auto
c7453090
Merge remote-tracking branch 'upstream/main' into add-gptq-marc
98591ab7
change model
088f56fa
add CI
4b019ead
Apply suggestions from code review
49362ac2
Update .github/workflows/test_gptq.yml
9de89186
add peft compatibility
ba9b2c99
Apply suggestions from code review doc
e255ca9b
merge examples
b01bbfd7
code review
62ac8bb8
fix test
b0007fc9
make style
19dff004
change var
15727f79
fxmarty approved these changes on 2023-08-03
fix doc
c5069472
add exllama
744c2495
change naming
66d71043
more doc
b43d6e0e
SunMarc merged 9f2943eb into main 2 years ago