onnxruntime
3df3a851 - Default kOrtSessionOptionsDisableQuantQDQ to 1 when the DML EP is registered (#15725)

Commit

2 years ago

Default kOrtSessionOptionsDisableQuantQDQ to 1 when the DML EP is registered (#15725)

This addresses a performance regression in some INT8 models with the DirectML EP by defaulting kOrtSessionOptionsDisableQuantQDQ to 1 when the EP is registered. The regression occurred due to the introduction of the QDQ propagation transformer, which is controlled by this session option. That transformer maximizes the number of nodes that are executed as quantized by logically propagating quantize operators upstream and dequantize operators downstream. However, it does this simply by inserting QDQ pairs, with the expectation that a later pass will recognize sequences of DQ->Op->Q. This logic and the related L2 transformers are not currently enabled for the DirectML EP.

This change also removes a noisy warning emitted when the session option for memory pattern is overridden as the DirectML EP is registered.
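The defaulting behavior described in this message can be sketched roughly as follows. This is a simplified pure-Python model, not the ONNX Runtime source: the function `register_execution_provider` and the dict-based config store are hypothetical, the key string `session.disable_quant_qdq` is assumed to be the value behind `kOrtSessionOptionsDisableQuantQDQ`, and the sketch assumes the default is applied only when the user has not already set the option.

```python
# Hypothetical model of the defaulting logic; names here are illustrative only.

# Assumed string value of kOrtSessionOptionsDisableQuantQDQ.
DISABLE_QUANT_QDQ_KEY = "session.disable_quant_qdq"
DML_EP = "DmlExecutionProvider"


def register_execution_provider(config_entries: dict, provider_name: str) -> dict:
    """Return a copy of the session config after registering a provider.

    When the DirectML EP is registered and the user has not set the
    disable-QDQ option, default it to "1" so QDQ handling is skipped.
    An explicit user value (assumption) is left untouched.
    """
    updated = dict(config_entries)
    if provider_name == DML_EP:
        # setdefault only writes the key when it is absent.
        updated.setdefault(DISABLE_QUANT_QDQ_KEY, "1")
    return updated


# No user setting: registering the DML EP turns the option on.
dml_defaults = register_execution_provider({}, DML_EP)

# Explicit user setting: kept as-is under the assumption stated above.
user_override = register_execution_provider({DISABLE_QUANT_QDQ_KEY: "0"}, DML_EP)

# Other providers: the config is not modified.
cpu_config = register_execution_provider({}, "CPUExecutionProvider")
```

If the real implementation respects explicit settings in the same way, a user who wants the QDQ transformers back with the DirectML EP would set the option to "0" on the session options before creating the session.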

References

#15725 - Default kOrtSessionOptionsDisableQuantQDQ to 1 when the DML EP is registered

Author

jeffbloo

Parents

10dff4f6
