onnxruntime
b47e1e64 - [QNN EP] Make offloading graph input/output quantization (to CPU) the default (#23368)

Commit
331 days ago
### Description
Makes the QNN provider option `offload_graph_io_quantization` enabled by default. It was previously disabled by default.

### Motivation and Context
Enabling this option significantly decreases inference latency for many models.
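Since this change flips the default, applications that relied on the old behavior must now disable the option explicitly. A minimal sketch of how that might look with the ONNX Runtime Python API, assuming a hypothetical model file `model.qdq.onnx` and the HTP backend library name for your platform; provider option values are passed as strings:

```python
# Hypothetical sketch: opting back out of the new default by passing
# explicit provider options to the QNN execution provider.
qnn_options = {
    # Backend library name is platform-specific, e.g. "QnnHtp.dll" on
    # Windows or "libQnnHtp.so" on Linux/Android (assumption for this sketch).
    "backend_path": "QnnHtp.dll",
    # "0" restores the pre-#23368 behavior: graph input/output
    # quantization stays on QNN instead of being offloaded to CPU.
    "offload_graph_io_quantization": "0",
}

providers = [("QNNExecutionProvider", qnn_options)]

# Creating the session requires an onnxruntime build with the QNN EP:
# import onnxruntime as ort
# session = ort.InferenceSession("model.qdq.onnx", providers=providers)
print(providers)
```

Omitting `offload_graph_io_quantization` (or setting it to `"1"`) now yields the offloading behavior by default.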