optimum
4f95f593 - Quantization with TFLite (#854)

Committed 2 years ago
Quantization with TFLite (#854)

* Cleaning DatasetProcessor
* Cache
* Cache
* Starting work
* WIP
* [WIP] Big refactor of TaskProcessing classes
* [WIP] tests
* [WIP] tests almost done
* [WIP] tests almost done
* [WIP] tests done
* Quantization working
* Remove dependency on evaluate
* Renaming file
* Adding torchvision as a dependency
* [WIP] quantization tests
* Fix get_task_from_model
* Styling
* Fix bad argument name
* Fix batching
* Add quantization approach (see the converter sketch after this list)
* Fix stable diffusion test
* Add CLI tests
* Load smallest split if not provided
* Fix fallback_to_float argument
* Skipping benchmark tests
* Decouple GitHub Actions to make tests faster
* Mark test in tests/exporters/tflite/test_tflite_export.py as well
* Add docstrings
* Add argument description
* Styling
* Skipping tests for unsupported tasks
* [WIP] Make export quantization arguments a dataclass (sketched below)
* Apply suggestions
* Fix token-classification for failing models
* Fix question-answering tests
* Filter which models support int8x16
* Fix image-classification image key
* Styling
* Remove int8x16 for RoFormer
* Fix question-answering issue with null columns
* Disable int8 quantization for DeBERTa and XLM-RoBERTa on question-answering for now
* Disable int8 quantization for DeBERTa
* Rename QuantizationConfig to TFLiteQuantizationConfig
* Add warnings when guessing the data keys
* Fix import in test
* Remove pytest.ini
* Fuse get_task_from_model with infer_task_from_model (usage sketch below)
* Add log if smallest split is used
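The feature builds on TensorFlow Lite's post-training quantization. Below is a minimal sketch of the underlying converter flow using the public `tf.lite` API; the SavedModel path, input shape, and random calibration data are illustrative stand-ins (the commit calibrates on a real dataset, loading the smallest split when none is provided):

```python
import numpy as np
import tensorflow as tf

# Load a previously exported SavedModel (path is illustrative).
converter = tf.lite.TFLiteConverter.from_saved_model("exported_model")

# The representative dataset yields sample inputs so the converter can
# calibrate activation ranges; yield one array per model input. Random
# token ids stand in here for a real calibration split.
def representative_dataset():
    for _ in range(100):
        yield [np.random.randint(0, 30000, size=(1, 128), dtype=np.int32)]

converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset

# Full int8: restrict ops to int8 kernels. Dropping this restriction lets
# unsupported ops fall back to float (cf. the fallback_to_float argument).
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
# The int8x16 approach (16-bit activations, 8-bit weights) would use
# tf.lite.OpsSet.EXPERIMENTAL_TFLITE_BUILTINS_ACTIVATIONS_INT16_WEIGHTS_INT8
# instead; the commit filters out models (e.g. RoFormer) that do not support it.

tflite_model = converter.convert()
with open("model_quantized.tflite", "wb") as f:
    f.write(tflite_model)
```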
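The export quantization arguments also become a dataclass, renamed from QuantizationConfig to TFLiteQuantizationConfig. The actual fields are not visible in the log; the sketch below is a hypothetical reconstruction from the arguments the log does mention (approach, fallback_to_float, calibration split), not optimum's real API:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class TFLiteQuantizationConfig:
    """Hypothetical shape of the export quantization arguments; field
    names are inferred from the commit log, not from optimum's source."""
    approach: Optional[str] = None             # e.g. "int8", "int8x16", "fp16"
    fallback_to_float: bool = False            # allow float kernels where int8 is unsupported
    calibration_dataset: Optional[str] = None  # dataset used for range calibration
    calibration_split: Optional[str] = None    # smallest split is used (and logged) if omitted
```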
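Lastly, get_task_from_model is fused into infer_task_from_model, which optimum exposes on TasksManager. A usage sketch, with the model id chosen only as an example:

```python
from optimum.exporters import TasksManager

# Infer the task (e.g. "text-classification") from a Hub model id or a
# loaded model instance, instead of passing --task explicitly.
task = TasksManager.infer_task_from_model("bert-base-uncased")
print(task)
```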