Refactor ORTQuantizer (#270)

Commit

3 years ago

Refactor ORTQuantizer (#270)

* first draft for transformers free quantizing
* rewrote from_pretrained method to support raw onnx files, ortmodels and transformers
* added documentation
* added documentation
* added tests
* make style
* bump version due to get_processor
* bump huggingface_hub due to EntryNotFoundError
* applied feedback
* renaming of pipeline_task to export_feature
* fix benchmarks except for multiple-choice
* fix benchmarks except multiple-choice
* apply feedback
* fix
* removed quantization test from optimization file
* add fit and remove from_transformers
* improved documentation
* adjusted different benchmarks
* added tokenizer
* style check
* removed old stuff
* removed features
* prefix to suffix
* added doc suggestions
* naming alignment
* benchmark tests
* applied suggestions
* add fix for benchmarks
* rename output path to save dir
* Update optimum/quantization_base.py
  Co-authored-by: regisss <15324346+regisss@users.noreply.github.com>
* Update optimum/quantization_base.py
  Co-authored-by: regisss <15324346+regisss@users.noreply.github.com>
* Update docs/source/onnxruntime/quantization.mdx
  Co-authored-by: regisss <15324346+regisss@users.noreply.github.com>

Co-authored-by: regisss <15324346+regisss@users.noreply.github.com>

References

#270 - Refactor ORTQuantizer

Author

philschmid

Parents

f69264f9

optimum 69f2883d - Refactor ORTQuantizer (#270)
