Refactor ORTQuantizer (#270)
* first draft of transformers-free quantization
* rewrote from_pretrained method to support raw ONNX files, ORTModels and transformers models
* added documentation
* added tests
* make style
* bump version due to get_processor
* bump huggingface_hub due to EntryNotFoundError
* applied feedback
* renamed pipeline_task to export_feature
* fix benchmarks except for multiple-choice
* apply feedback
* fix
* removed quantization test from optimization file
* add fit and remove from_transformers
* improved documentation
* adjusted different benchmarks
* added tokenizer
* style check
* removed old stuff
* removed features
* prefix to suffix
* added doc suggestions
* naming alignment
* benchmark tests
* applied suggestions
* add fix for benchmarks
* rename output path to save dir
* Update optimum/quantization_base.py
* Update optimum/quantization_base.py
* Update docs/source/onnxruntime/quantization.mdx
Co-authored-by: regisss <15324346+regisss@users.noreply.github.com>