transformers
2963e196 - Add support for loading GPTQ models on CPU (#26719)

Commit

2 years ago

Add support for loading GPTQ models on CPU (#26719)

* Add support for loading GPTQ models on CPU

Right now, we can only load a GPTQ-quantized model on a CUDA device. The attribute `gptq_supports_cpu` checks whether the installed auto_gptq version is one that has CPU support for the model. The larger variants of the model are hard to load/run/trace on the GPU, and that's the rationale behind adding this attribute.

Signed-Off By: Vivek Khandelwal <vivek@nod-labs.com>

* Update quantization.md
* Update quantization.md
* Update quantization.md
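The version check that the commit message attributes to `gptq_supports_cpu` could look roughly like the sketch below. This is an illustration and not the code from the commit. The helper `parse_release`, the distribution name `auto-gptq`, and the threshold `0.4.2` are all assumptions, since the message does not say which auto_gptq release introduced CPU support.

```python
import importlib.metadata
import re

# Assumed threshold for illustration only; the commit message does not state
# which auto_gptq release first supports CPU.
ASSUMED_LAST_VERSION_WITHOUT_CPU = (0, 4, 2)


def parse_release(version_string):
    """Return the leading numeric release segment as a tuple of ints.

    Suffixes such as '.dev0' or '+cu118' are ignored. This is a hypothetical
    helper, much simpler than a full PEP 440 parser.
    """
    match = re.match(r"\d+(?:\.\d+)*", version_string)
    if match is None:
        return ()
    return tuple(int(part) for part in match.group(0).split("."))


def gptq_supports_cpu(installed_version=None, package="auto-gptq"):
    """Return True if the installed version is newer than the assumed threshold."""
    if installed_version is None:
        try:
            # Look up the version of the installed distribution, if any.
            installed_version = importlib.metadata.version(package)
        except importlib.metadata.PackageNotFoundError:
            # Package not installed, so CPU loading cannot be supported.
            return False
    return parse_release(installed_version) > ASSUMED_LAST_VERSION_WITHOUT_CPU
```

A caller might use the result to decide whether a GPTQ model may be placed on the CPU, or whether a CUDA device is required. Passing `installed_version` explicitly makes the check easy to exercise without auto_gptq being installed.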

References

#26719 - Add support for loading GPTQ models on CPU

#27720 - Add common processor tests

#29969 - [SigLIP] Add fast tokenizer

#32831 - [Docs] Update resources

#33111 - [Backbone] Remove out_features everywhere

#33174 - [Zero-shot image classification pipeline] Remove tokenizer_kwargs

#39821 - Support MetaCLIP 2

#59 - Fix attention mask handling in EoMT-DINOv3 converter

#62 - Add initial DEIMv2 model implementation

#65 - Fix RTDetrV2 sine position embedding ordering

Author

vivekkhandelwal1

Parents

3cd3eaf9
