onnxruntime
a47254ea - Remove empty (DQ -> Q -> graph output) sequence in TransposeOptimizer (#22172)

Commit
1 year ago
Remove empty (DQ -> Q -> graph output) sequence in TransposeOptimizer (#22172)

### Description
Updates the TransposeOptimizer to also remove empty (DQ -> Q) sequences that occur at a graph output. An empty DQ->Q sequence results from a Transpose being optimized out.

Consider the following example model:

![image](https://github.com/user-attachments/assets/4e7bc4eb-ea8a-463b-9672-c4ec5ef779b2)

The TransposeOptimizer removes the final Transpose and leaves an empty DQ->Q->output_0 sequence. This PR ensures that the final DQ->Q is also removed.

### Motivation and Context
Models with quantized output can run on QNN EP. The inference latency of a customer model is impacted by the unnecessary DQ->Q sequence at the output.

---------

Co-authored-by: Scott McKay <skottmckay@gmail.com>
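
### Illustrative sketch

The removal described above can be sketched on a toy graph. This is not the TransposeOptimizer code from this commit. The `Node` class and the `remove_empty_dq_q_at_output` function are hypothetical names introduced for illustration, and scale and zero point are stored as plain fields, whereas in ONNX they are node inputs. The sketch assumes a DQ->Q pair is only a no-op when both nodes use the same scale and zero point, and it skips any pair whose tensors have other consumers.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """Toy stand-in for a graph node (hypothetical, not an onnxruntime type)."""
    op_type: str
    inputs: List[str]
    outputs: List[str]
    scale: Optional[float] = None      # simplification: an input tensor in real ONNX
    zero_point: Optional[int] = None   # simplification: an input tensor in real ONNX


def remove_empty_dq_q_at_output(nodes: List[Node], graph_outputs: List[str]) -> List[Node]:
    """Drop DQ -> Q pairs that feed a graph output directly and do nothing."""
    producer = {out: n for n in nodes for out in n.outputs}
    consumers = {}
    for n in nodes:
        for name in n.inputs:
            consumers.setdefault(name, []).append(n)

    removed = set()
    for out_name in graph_outputs:
        q = producer.get(out_name)
        if q is None or q.op_type != "QuantizeLinear":
            continue
        dq = producer.get(q.inputs[0])
        if dq is None or dq.op_type != "DequantizeLinear":
            continue
        dq_out, dq_in = dq.outputs[0], dq.inputs[0]
        # The DQ output must feed only this Q, and the Q output must only be a graph output.
        if len(consumers.get(dq_out, [])) != 1 or dq_out in graph_outputs:
            continue
        if consumers.get(out_name):
            continue
        # Assumption: the pair is a no-op only when quantization parameters match.
        if (dq.scale, dq.zero_point) != (q.scale, q.zero_point):
            continue
        src = producer.get(dq_in)
        # Skip pairs fed by a graph input or by a tensor that has other uses.
        if src is None or len(consumers.get(dq_in, [])) != 1 or dq_in in graph_outputs:
            continue
        # Make the upstream node produce the graph output directly.
        src.outputs = [out_name if o == dq_in else o for o in src.outputs]
        removed.update((id(dq), id(q)))

    return [n for n in nodes if id(n) not in removed]


def build_example(final_q_scale: float) -> List[Node]:
    """Graph left behind after a final Transpose is removed: ... -> Q -> DQ -> Q -> output_0."""
    return [
        Node("DequantizeLinear", ["x_q"], ["x"], 0.1, 0),
        Node("Relu", ["x"], ["y"]),
        Node("QuantizeLinear", ["y"], ["y_q"], 0.1, 0),
        Node("DequantizeLinear", ["y_q"], ["t"], 0.1, 0),
        Node("QuantizeLinear", ["t"], ["output_0"], final_q_scale, 0),
    ]


optimized = remove_empty_dq_q_at_output(build_example(0.1), ["output_0"])
unchanged = remove_empty_dq_q_at_output(build_example(0.2), ["output_0"])
```

In the first call the trailing DQ->Q pair is dropped and the preceding QuantizeLinear writes `output_0` directly. In the second call the final Q uses a different scale, so the pair requantizes the data and is left in place.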