Fully deprecate AutoGPTQ and AutoAWQ for GPT-QModel (#41567)
* fully deprecate autogptq
* remove use_cuda and use_exllama toggles; both are fully deprecated in gptqmodel
* format
* add `act_group_aware` property
* fix QUANT_TYPE assert
Signed-off-by: ZX-ModelCloud <zx@modelcloud.ai>
* format
* mod awq import
* remove autoawq fuse support
* remove autoawq.config fuse
* cleanup
* remove awq fuse test
* fix import
* use gptqmodel
* cleanup
* remove get_modules_to_fuse
* mod require_auto_awq -> require_gptqmodel
* convert version to checkpoint_format
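A minimal sketch of what the legacy-field conversion above could look like: renaming an old `version` key to the newer `checkpoint_format` key when loading a saved quantization config. The helper name `normalize_quant_config` is illustrative, not the actual transformers implementation.

```python
# Hypothetical sketch of mapping a legacy "version" field onto the newer
# "checkpoint_format" field. Names here are illustrative assumptions.
LEGACY_KEY = "version"
NEW_KEY = "checkpoint_format"


def normalize_quant_config(config: dict) -> dict:
    """Return a copy of `config` with the legacy key renamed, if present."""
    config = dict(config)  # avoid mutating the caller's dict
    if LEGACY_KEY in config and NEW_KEY not in config:
        config[NEW_KEY] = config.pop(LEGACY_KEY)
    return config


print(normalize_quant_config({"bits": 4, "version": "gemm"}))
# → {'bits': 4, 'checkpoint_format': 'gemm'}
```

Keeping the old key readable this way lets checkpoints serialized before the rename continue to load unchanged.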
* check is_gptqmodel_available
* revert modules_to_not_convert
* pass bits, sym, desc_act
* fix awqconfig init
* fix wrong args
* fix ipex
* mod ipex version check
* cleanup
* fix awq_linear
* remove self.exllama_config = exllama_config
* cleanup
* Revert "cleanup"
This reverts commit 90019c6fc4f7a617ed9db482a42ecd1cd07f9108.
* update is_trainable
* cleanup
* remove fused
* call hf_select_quant_linear_v2()
Signed-off-by: ZX-ModelCloud <zx@modelcloud.ai>
* Remove the "version" field from AwqConfig
Signed-off-by: ZX-ModelCloud <zx@modelcloud.ai>
* Add torch_fused inference; fix test_gptq test
Signed-off-by: ZX-ModelCloud <zx@modelcloud.ai>
* fix test_awq
Signed-off-by: ZX-ModelCloud <zx@modelcloud.ai>
* fix test_awq
Signed-off-by: ZX-ModelCloud <zx@modelcloud.ai>
* fix AwqConfig
Signed-off-by: ZX-ModelCloud <zx@modelcloud.ai>
* call hf_select_quant_linear_v2()
Signed-off-by: ZX-ModelCloud <zx@modelcloud.ai>
* remove auto_awq
Signed-off-by: ZX-ModelCloud <zx@modelcloud.ai>
* fix typo
Signed-off-by: ZX-ModelCloud <zx@modelcloud.ai>
* Compatible with legacy field: checkpoint_format
Signed-off-by: ZX-ModelCloud <zx@modelcloud.ai>
* Compatible with legacy field: checkpoint_format
Signed-off-by: ZX-ModelCloud <zx@modelcloud.ai>
* format
Signed-off-by: ZX-ModelCloud <zx@modelcloud.ai>
* cleanup
Signed-off-by: ZX-ModelCloud <zx@modelcloud.ai>
* update test_awq
Signed-off-by: ZX-ModelCloud <zx@modelcloud.ai>
* fix get_modules_to_not_convert()
Signed-off-by: ZX-ModelCloud <zx@modelcloud.ai>
* fix test_awq.py::AwqTest::test_quantized_model_exllama
Signed-off-by: ZX-ModelCloud <zx@modelcloud.ai>
* Apply style fixes
* test_awq.py added EXPECTED_OUTPUT
Signed-off-by: ZX-ModelCloud <zx@modelcloud.ai>
* update test_gptq.py
Signed-off-by: ZX-ModelCloud <zx@modelcloud.ai>
* fix test_awq.py::AwqTest::test_save_pretrained
Signed-off-by: ZX-ModelCloud <zx@modelcloud.ai>
* use assertEqual() instead of assertTrue()
Signed-off-by: ZX-ModelCloud <zx@modelcloud.ai>
* fix test_quantized_layers_class()
Signed-off-by: ZX-ModelCloud <zx@modelcloud.ai>
* remove ExllamaV1 Test
Signed-off-by: ZX-ModelCloud <zx@modelcloud.ai>
* format
Signed-off-by: ZX-ModelCloud <zx@modelcloud.ai>
* fix get_modules_to_not_convert()
Signed-off-by: ZX-ModelCloud <zx@modelcloud.ai>
* added EXPECTED_OUTPUT
Signed-off-by: ZX-ModelCloud <zx@modelcloud.ai>
* remove ExllamaV1 Test
Signed-off-by: ZX-ModelCloud <zx@modelcloud.ai>
* add AwqBackend.AUTO_TRAINABLE
Signed-off-by: ZX-ModelCloud <zx@modelcloud.ai>
* Update docs/source/zh/llm_tutorial.md
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
* revert temporary fix
Signed-off-by: ZX-ModelCloud <zx@modelcloud.ai>
---------
Signed-off-by: ZX-ModelCloud <zx@modelcloud.ai>
Co-authored-by: ZX-ModelCloud <zx@modelcloud.ai>
Co-authored-by: LRL2-ModelCloud <lrl2@modelcloud.ai>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: ZX-ModelCloud <165115237+ZX-ModelCloud@users.noreply.github.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>