transformers
53f8a082 - SINQ quantization strategy integration (adapted for Transformers V5) (#43112)

Commit
26 days ago
SINQ quantization strategy integration (adapted for Transformers V5) (#43112)

* sinq integration files
* sinq integration update
* sinq integration no lazy import
* Tests for sinq integration
* minor changes to sinq integration
* sinq integration documentation added
* small correction to sinq documentation
* small correction to sinq documentation
* remove auto_patch_io flag and fix the selection of the device for sinq integration
* remove auto_patch_io flag and fix the selection of the device for sinq integration
* remove auto_patch_io flag and fix the selection of the device for sinq integration
* remove auto_patch_io flag and fix the selection of the device for sinq integration
* Code style fix sinq integration
* minor changes in comments for sinq integration
* add for documentation sinq integration
* add documentation for sinq integration
* minor adjustment in sinq quantizer
* minor changes to sinq integration
* delete debugging print in sinq integration
* sinq integration files
* sinq integration update
* sinq integration no lazy import
* Tests for sinq integration
* minor changes to sinq integration
* sinq integration documentation added
* small correction to sinq documentation
* small correction to sinq documentation
* remove auto_patch_io flag and fix the selection of the device for sinq integration
* remove auto_patch_io flag and fix the selection of the device for sinq integration
* remove auto_patch_io flag and fix the selection of the device for sinq integration
* remove auto_patch_io flag and fix the selection of the device for sinq integration
* Code style fix sinq integration
* minor changes in comments for sinq integration
* add for documentation sinq integration
* add documentation for sinq integration
* minor adjustment in sinq quantizer
* minor changes to sinq integration
* delete debugging print in sinq integration
* Adapt sinq integration to transformers v5
* sinq integration for transformers v5
* Added part of the suggested modifications to make the code simpler
* Modification of the quantization flow and removal of asinq option
* Minor adjustments and creation of function to substitute quantized layers
* Eliminate device specification in SinqConfig and tests script adaptation
* final adjustments and checks
* final checks
* Fix merge conflict in import_utils
* Fix code quality
* update tests and fixing minor typos
* Fix grammar of warning message
* Update tests/quantization/sinq/test_sinq.py
* Apply repo consistency fixes

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>