onnxruntime
4d3cd2f6 - Add helper for optimizing a QDQ format model for usage with ORT. (#10595)

Commit

4 years ago

Add helper for optimizing a QDQ format model for usage with ORT. (#10595)

* Add initial helper for optimizing a QDQ format model for usage with ORT.

  If a DQ node has multiple consumers it will end up in multiple QDQ node units. This is complicated to handle as each QDQ node unit could end up being handled by different execution providers. By duplicating the DQ node we simplify this logic.

  Generally the duplicate nodes will disappear when the QDQ node unit is converted to a single node with a quantized operator. If there are QDQ node units that are not able to be converted to use a quantized operator, the ORT cleanup (pending) to drop remaining Q->DQ pairs between fp32 nodes can remove any remaining DQ nodes.

* Fix pep8 warning

Co-authored-by: Guoyu Wang <wanggy@outlook.com>
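The DQ duplication described in the commit message can be illustrated with a small, self-contained sketch. This is not the helper added by this commit: the `Node` class and `duplicate_shared_dq_nodes` function are hypothetical names on a toy graph representation, and the `/duplicated_N` naming scheme is an assumption made for the example. It also assumes the node list is topologically sorted and ignores graph outputs as consumers.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Node:
    # Hypothetical, simplified stand-in for an ONNX node.
    name: str
    op_type: str
    inputs: List[str]
    outputs: List[str]


def duplicate_shared_dq_nodes(nodes: List[Node]) -> List[Node]:
    """Give every consumer of a DequantizeLinear output its own DQ node.

    Consumer nodes are updated in place; a new node list is returned.
    """
    # Map each value name to the distinct nodes that read it.
    consumers: Dict[str, List[Node]] = {}
    for node in nodes:
        for value in dict.fromkeys(node.inputs):  # de-duplicate, keep order
            consumers.setdefault(value, []).append(node)

    result: List[Node] = []
    for node in nodes:
        result.append(node)
        if node.op_type != "DequantizeLinear":
            continue

        dq_output = node.outputs[0]
        readers = consumers.get(dq_output, [])
        # The first consumer keeps the original DQ node.
        # Every additional consumer gets its own copy (assumed naming scheme).
        for i, reader in enumerate(readers[1:], start=1):
            copy_output = f"{dq_output}/duplicated_{i}"
            result.append(
                Node(
                    name=f"{node.name}/duplicated_{i}",
                    op_type=node.op_type,
                    inputs=list(node.inputs),
                    outputs=[copy_output],
                )
            )
            # Rewire this consumer to read from its private copy.
            reader.inputs = [
                copy_output if value == dq_output else value
                for value in reader.inputs
            ]
    return result


# One DQ node feeding two consumers: Conv and Relu.
graph = [
    Node("dq", "DequantizeLinear", ["x_q", "x_scale", "x_zp"], ["x"]),
    Node("conv", "Conv", ["x", "w"], ["y1"]),
    Node("relu", "Relu", ["x"], ["y2"]),
]
new_graph = duplicate_shared_dq_nodes(graph)
for n in new_graph:
    print(n.name, n.op_type, n.inputs, "->", n.outputs)
```

After the call, each DQ output has exactly one consumer, so each QDQ node unit owns its DQ node outright and could be assigned to an execution provider without affecting the others.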

References

#10595 - Add helper for optimizing a QDQ format model for usage with ORT.

Author

skottmckay

Parents

4a79ed62
