onnxruntime
0ebd8cc9 - fix: auto-upgrade model opset to 21 for int16/uint16 QDQ quantization (#28202)

Commit

3 days ago

fix: auto-upgrade model opset to 21 for int16/uint16 QDQ quantization (#28202)

## Summary

- Extends the existing `update_opset_version` helper to auto-bump the opset from < 21 to 21 when QUInt16/QInt16 weight quantization is requested
- Modeled on the existing float8 quantization opset upgrade pattern
- Adds test coverage with parametric subtests for 16-bit int quantization

## Motivation

Fixes #25223. Users quantizing models exported via `torch.export` with uint16/int16 types hit a gap: models below opset 21 were not upgraded. Mirroring the existing float8 branch gives users a consistent, predictable upgrade path for 16-bit QDQ.

## Changes

- `onnxruntime/python/tools/quantization/quant_utils.py`: new `elif` branch in `update_opset_version` that bumps the opset to 21 when `weight_quant_type` is `INT16` / `UINT16` and the current opset is < 21. Emits a warning in the same style as the existing float8 branch.
- `onnxruntime/test/python/quantization/test_quant_util.py`: new `test_update_opset_version_16bit` with parametric subtests covering `QUInt16` / `QInt16` bumping from opset 20 → 21, plus a no-op regression check for models already at opset 21.

## Test Plan

```
python -m pytest onnxruntime/test/python/quantization/test_quant_util.py -v
```

All tests pass. `lintrunner -a` produces no changes.
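To illustrate the decision logic described under Changes, here is a minimal, self-contained sketch. It is not the code from `quant_utils.py`: `WeightQuantType`, `target_opset`, and `MIN_OPSET_FOR_16BIT_QDQ` are hypothetical names chosen for this example, and the sketch only computes the opset to upgrade to rather than converting a real model. The test class mimics the parametric-subtest style mentioned above using the standard `unittest.subTest` mechanism.

```python
import logging
import unittest
from enum import Enum

logger = logging.getLogger(__name__)


class WeightQuantType(Enum):
    """Hypothetical stand-in for the quantization type enum (illustration only)."""
    QInt8 = "int8"
    QUInt8 = "uint8"
    QInt16 = "int16"
    QUInt16 = "uint16"


# Per the commit message, 16-bit QDQ quantization requires opset 21.
MIN_OPSET_FOR_16BIT_QDQ = 21
SIXTEEN_BIT_TYPES = (WeightQuantType.QInt16, WeightQuantType.QUInt16)


def target_opset(current_opset: int, weight_quant_type: WeightQuantType) -> int:
    """Return the opset the model should be at before quantization (hypothetical helper)."""
    if weight_quant_type in SIXTEEN_BIT_TYPES and current_opset < MIN_OPSET_FOR_16BIT_QDQ:
        # Warn so the user knows the model is being changed on their behalf.
        logger.warning(
            "Opset %d does not support %s quantization; upgrading to opset %d.",
            current_opset,
            weight_quant_type.name,
            MIN_OPSET_FOR_16BIT_QDQ,
        )
        return MIN_OPSET_FOR_16BIT_QDQ
    # Already new enough, or not a 16-bit type: leave the opset alone.
    return current_opset


class TestTargetOpset(unittest.TestCase):
    def test_16bit_types_bump_to_21(self):
        # One subtest per 16-bit type, similar in spirit to the commit's parametric subtests.
        for quant_type in SIXTEEN_BIT_TYPES:
            with self.subTest(quant_type=quant_type):
                self.assertEqual(target_opset(20, quant_type), 21)

    def test_no_op_when_already_at_21(self):
        for quant_type in SIXTEEN_BIT_TYPES:
            with self.subTest(quant_type=quant_type):
                self.assertEqual(target_opset(21, quant_type), 21)
```

Saved to a file, the test class can be run with `python -m pytest` or `python -m unittest`. Returning the target opset instead of mutating a model keeps the sketch free of any `onnx` dependency; the real helper operates on a model object.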

References

#28202 - fix: auto-upgrade model opset to 21 for int16/uint16 QDQ quantization

Author

Rishi-Dave

Parents

4d9aa47a
