optimum
3a85d512 - Fully deprecate AutoGPTQ for GPT-QModel (#2385)

Commit
41 days ago
Fully deprecate AutoGPTQ for GPT-QModel (#2385)

* call hf_select_quant_linear_v2()
  Signed-off-by: ZX-ModelCloud <zx@modelcloud.ai>
* Remove the import of auto_gptq
  Signed-off-by: ZX-ModelCloud <zx@modelcloud.ai>
* Remove the import of auto_gptq in unittests
  Signed-off-by: ZX-ModelCloud <zx@modelcloud.ai>
* Fixed the incorrect layer_output when using the latest version of transformers.
  Signed-off-by: ZX-ModelCloud <zx@modelcloud.ai>
* use assertEqual instead of assertTrue
  Signed-off-by: ZX-ModelCloud <zx@modelcloud.ai>
* load_quantized_model() remove "disable_exllama" and "exllama_config" arguments
  Signed-off-by: ZX-ModelCloud <zx@modelcloud.ai>
* fix gptq unittest
  Signed-off-by: ZX-ModelCloud <zx@modelcloud.ai>
* fix gptq unittest
  Signed-off-by: ZX-ModelCloud <zx@modelcloud.ai>
* setting offload_to_disk=False
  Signed-off-by: ZX-ModelCloud <zx@modelcloud.ai>
* remove auto_gptq
  Signed-off-by: ZX-ModelCloud <zx@modelcloud.ai>
* format
  Signed-off-by: ZX-ModelCloud <zx@modelcloud.ai>
* format
  Signed-off-by: ZX-ModelCloud <zx@modelcloud.ai>
* fix
* fix
* remove ExllamaV1 Test
  Signed-off-by: ZX-ModelCloud <zx@modelcloud.ai>
* update ci
* test
* avoid uv
* re-enable uv and pin to >= 5.6.12
* Update .github/workflows/test_gptq.yml
  Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>

---------

Signed-off-by: ZX-ModelCloud <zx@modelcloud.ai>
Co-authored-by: IlyasMoutawwakil <moutawwakil.ilyas.tsi@gmail.com>
Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>
Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>