Add HQQ quantization support #29637
Minami-su
approved these changes
on 2024-03-29
update HQQ transformers integration
bbc68fee
Merge branch 'huggingface:main' into stable
2a1f2245
push import_utils.py
e1e5df68
add force_hooks check in modeling_utils.py
0192b03b
fix | with Optional
823de372
force bias as param
08d7b8e6
check bias is Tensor
e1fa6c96
force forward for multi-gpu
6e854cae
review fixes pass
2b9f271a
remove torch grad()
5bb9ca25
if any key in linear_tags fix
392e7c5e
add cpu/disk check
20f9ad5b
isinstance return
3a5679a9
add multigpu test + refactor tests
7a1bbca2
clean hqq_utils imports in hqq.py
65b28879
clean hqq_utils imports in quantizer_hqq.py
bba74cd2
delete hqq_utils.py
de88c2af
Delete src/transformers/utils/hqq_utils.py
651a5863
ruff init
d07ea850
remove torch.float16 from __init__ in test
dedf69ec
refactor test
0edf8a43
isinstance -> type in quantizer_hqq.py
c7ec1239
cpu/disk device_map check in quantizer_hqq.py
5283ac20
remove type(module) nn.linear check in quantizer_hqq.py
15daeb48
add BaseQuantizeConfig import inside HqqConfig init
bc4bc73e
remove hqq import in hqq.py
b54e87b2
remove accelerate import from test_hqq.py
0f9698af
quant config.py doc update
d31837fb
add hqqconfig to main_classes doc
b8f792c7
Merge branch 'huggingface:main' into stable
8b84cb1e
make style
9a061e56
__init__ fix
86122823
ruff __init__
b7867932
skip_modules list
e7ba7170
hqqconfig format fix
3a38f210
hqqconfig doc fix
9eee2131
hqqconfig doc fix
03cc8e6c
hqqconfig doc fix
96bd141b
hqqconfig doc fix
713d2261
hqqconfig doc fix
dad9a60d
hqqconfig doc fix
67c0985d
hqqconfig doc fix
94c393a8
hqqconfig doc fix
35fc9f50
hqqconfig doc fix
06f64978
SunMarc
approved these changes
on 2024-04-29
test_hqq.py remove mistral comment
25fde9c7
remove self.using_multi_gpu is False
ee50516c
torch_dtype default val set and logger.info
01d798a4
hqq.py isinstance fix
a909ca8a
remove torch=None
c466c89a
torch_device test_hqq
d522fed9
rename test_hqq
a09e90ff
MODEL_ID in test_hqq
5bdf40f4
quantizer_hqq setattr fix
e693d473
quantizer_hqq typo fix
f5cabe58
imports quantizer_hqq.py
5ede086e
isinstance quantizer_hqq
c86000bc
hqq_layer.bias reformat quantizer_hqq
7d3e0839
Step 2 as comment in quantizer_hqq
082dfea5
prepare_for_hqq_linear() comment
667f1adb
keep_in_fp32_modules fix
e0cd7846
HqqHfQuantizer reformat
5d3b504e
quantization.md hqqconfig
cc1961cb
quantization.md model example reformat
9aa9e15a
quantization.md # space
9273e21d
quantization.md space })
f29e7a4e
quantization.md space })
5168852d
quantization_config fix doc
0dfe0806
axis value check in quantization_config
29340526
format
bc7cf4ee
dynamic config explanation
d33f944a
quant config method in quantization.md
3522f0a6
remove shard-level progress
cc14c211
.cuda fix modeling_utils
1e81036f
test_hqq fixes
ca07f5a3
Merge branch 'huggingface:main' into stable
4cc776e4
make fix-copies
3d777ed7
Merge branch 'huggingface:main' into stable
b8088581
Merge branch 'huggingface:main' into stable
5e711390