onnxruntime
559162f2 - fix: propagate output_dtype attribute when inserting Q after DQ (#28144)

Commit
1 day ago
fix: propagate output_dtype attribute when inserting Q after DQ (#28144)

## Summary

- `MakeQAttrsFromDQ()` in `qdq_propagation.cc` only copied the DQ's existing attributes (`axis`, `block_size`) when constructing the inserted `QuantizeLinear` node, omitting `output_dtype`.
- For opset-21+ graphs whose DQ has no `zero_point` input, `qdq_util.cc` falls back to `UINT8` when `output_dtype` is missing, silently saturating negative `INT8` values to 0.
- Inject `output_dtype` derived from the DQ's input element type for opset >= 21, leaving older opsets unchanged.

## Motivation

Fixes #27845. Without this fix, enabling QDQ propagation (`ORT_ENABLE_ALL`) produces different, and silently incorrect, outputs versus `ORT_DISABLE_ALL` for any `DequantizeLinear(int8) -> Reshape/Transpose/...` pattern lacking a zero-point input.

## Changes

- `onnxruntime/core/optimizer/qdq_transformer/qdq_propagation.cc`: when `dq_node.SinceVersion() >= 21`, read the DQ input's element type from `InputDefs()[0]->TypeAsProto()` and inject it as `output_dtype` on the propagated Q node. Removed the stale `assert(SinceVersion() <= 21)` (`MatchDQNode()` accepts opsets up to 25).
- `onnxruntime/test/optimizer/qdq_transformer_test.cc`: new test `QDQPropagation_DQForward_NoZP_OutputDtypeAttribute`, parametrized across `INT8`, `UINT8`, `INT16`, `UINT16`, asserting that the inserted Q carries an `output_dtype` matching the DQ input type.

## Test Plan

- `./onnxruntime_test_all --gtest_filter="QDQTransformerTests.QDQPropagation*"` (covers both the new test and the sibling `QDQPropagation_QBackward_NoZP_OutputDtypeAttribute`).
- Existing CPU EP CI suites.

Fixes #27845
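The failure mode described in the Summary can be illustrated with a small standalone sketch. It is not the onnxruntime code: `ElemType`, `ResolveQOutputType`, `MakeOutputDtypeAttr`, and `QuantizeSaturate` are hypothetical names introduced here, and the sketch assumes a zero point of 0 and round-to-nearest quantization. It only models the two behaviours the commit message describes: the `UINT8` fallback when `output_dtype` is absent, and carrying the DQ input type over as `output_dtype` for opset >= 21.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>
#include <optional>

// Hypothetical stand-in for the element types involved; not ORT's or ONNX's enum.
enum class ElemType { kUInt8, kInt8, kUInt16, kInt16 };

// Simplified model of resolving the Q output type when there is no zero_point
// input: use output_dtype if present, otherwise fall back to UINT8.
ElemType ResolveQOutputType(std::optional<ElemType> output_dtype) {
  return output_dtype.value_or(ElemType::kUInt8);
}

// Simplified model of the fix: for opset >= 21, the DQ input element type
// becomes the output_dtype attribute of the inserted Q; older opsets get none.
std::optional<ElemType> MakeOutputDtypeAttr(int dq_since_version,
                                            ElemType dq_input_type) {
  if (dq_since_version >= 21) return dq_input_type;
  return std::nullopt;
}

// Saturating linear quantization with an assumed zero point of 0.
template <typename T>
T QuantizeSaturate(float x, float scale) {
  assert(scale > 0.0f);
  const float q = std::nearbyint(x / scale);
  const float lo = static_cast<float>(std::numeric_limits<T>::min());
  const float hi = static_cast<float>(std::numeric_limits<T>::max());
  return static_cast<T>(std::clamp(q, lo, hi));
}
```

Under these assumptions, a value such as `-3.0f` with scale `0.5f` quantizes to `-6` as `std::int8_t` but is clamped to `0` as `std::uint8_t`, which is the silent saturation the fix avoids by making the inserted Q resolve to the DQ's input type.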