Drop QDQ around more nodes (#21376)

Commit

1 year ago

### Description

Extends the Drop QDQ optimization to remove DequantizeLinear and QuantizeLinear nodes from around these operators:

- Flatten
- Expand
- Tile
- Slice
- GatherElements
- ReduceMin
- ReduceMax

### Motivation and Context

To reduce floating-point conversions in quantized inference. Mainly motivated by the Flatten case, since that shows up in graphs exported from PyTorch to ONNX. To make the change complete, the optimization is also extended to a larger set of ops for which it is valid.

https://github.com/microsoft/onnxruntime/issues/21375

---------

Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
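A rough illustration of why the optimization can be valid for ops like these: they only move or select elements, so running them directly on the quantized integers should give the same result as dequantizing, running the op in floating point, and quantizing again. The sketch below is pure Python and is not onnxruntime code. The helper names (`quantize`, `dequantize`, `with_qdq`, `without_qdq`) are hypothetical, and it assumes uint8 affine quantization with the same positive scale and zero point on both sides of the op.

```python
def quantize(x, scale, zero_point):
    # Assumed uint8 affine quantization: round to nearest (ties to even), then saturate.
    q = round(x / scale) + zero_point
    return max(0, min(255, q))


def dequantize(q, scale, zero_point):
    return (q - zero_point) * scale


def flatten(rows):
    # Pure data movement: 2-D nested list -> 1-D list.
    return [v for row in rows for v in row]


def reduce_max(rows):
    # Selects one existing element per row; order-preserving under a positive scale.
    return [max(row) for row in rows]


def with_qdq(op, q_rows, scale, zero_point):
    # DequantizeLinear -> op -> QuantizeLinear, sharing one scale and zero point.
    f_rows = [[dequantize(q, scale, zero_point) for q in row] for row in q_rows]
    return [quantize(v, scale, zero_point) for v in op(f_rows)]


def without_qdq(op, q_rows):
    # The op applied directly to the quantized integers.
    return op(q_rows)


q_rows = [[0, 17, 255], [128, 3, 200]]
scale, zero_point = 0.05, 128

for op in (flatten, reduce_max):
    assert with_qdq(op, q_rows, scale, zero_point) == without_qdq(op, q_rows)
```

If the scale or zero point differed between the DequantizeLinear and QuantizeLinear nodes, the two paths would generally disagree, which is why the sketch fixes them to be equal.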

References

#21376 - Drop QDQ around more nodes

Author

mcollinswisc

Parents

6e575769

onnxruntime 5d54dc14 - Drop QDQ around more nodes (#21376)
