onnxruntime
46e8d454 - [QNN EP] Add FusedMatMul operator support (#27044)

Commit

189 days ago

[QNN EP] Add FusedMatMul operator support (#27044)

### Description

Add support for the FusedMatMul operator in the QNN execution provider. FusedMatMul is a contrib operator in the Microsoft domain that performs a fused matrix multiplication with optional bias addition and activation.

Implementation details:

- Added a FusedMatMulOpBuilder class that decomposes FusedMatMul into:
  1. MatMul operation
  2. Optional bias addition
  3. Optional activation (Relu, Sigmoid, Tanh, Gelu)
- Handles various attributes: transA, transB, alpha, and activation
- Supports higher rank tensors and different data types

Added comprehensive tests:

- Basic functionality tests with various configurations
- Tests for both CPU and HTP backends
- QDQ (Quantize-Dequantize) tests for 8-bit and 16-bit precision

### Motivation and Context

QNN HTP does not support FusedMatMul natively, so this change decomposes it into operators that QNN HTP does support. This improves inference time for customer models that contain the FusedMatMul operator.
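The decomposition described in this commit message can be illustrated with a small NumPy reference. This is only a sketch under stated assumptions, not the FusedMatMulOpBuilder code: the function name `fused_matmul_reference`, its parameter names, and the `ACTIVATIONS` table are hypothetical. It also assumes that `alpha` scales the matrix product before the bias is added, that `transA`/`transB` swap the last two axes, and that Gelu is the exact erf-based form. None of these is confirmed by the commit message.

```python
import math

import numpy as np

# Hypothetical activation table covering the four names listed in the commit message.
# Gelu uses the exact erf form here; that choice is an assumption.
_erf = np.vectorize(math.erf, otypes=[float])
ACTIVATIONS = {
    None: lambda x: x,
    "Relu": lambda x: np.maximum(x, 0.0),
    "Sigmoid": lambda x: 1.0 / (1.0 + np.exp(-x)),
    "Tanh": np.tanh,
    "Gelu": lambda x: 0.5 * x * (1.0 + _erf(x / math.sqrt(2.0))),
}


def fused_matmul_reference(a, b, bias=None, *, trans_a=False, trans_b=False,
                           alpha=1.0, activation=None):
    """Hypothetical reference: MatMul, then optional bias, then optional activation.

    Inputs are assumed to have rank >= 2; leading dimensions are treated as
    batch dimensions, following np.matmul broadcasting.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if trans_a:
        a = np.swapaxes(a, -1, -2)  # transpose only the last two axes
    if trans_b:
        b = np.swapaxes(b, -1, -2)

    y = alpha * np.matmul(a, b)     # step 1: MatMul, scaled by alpha (ordering assumed)
    if bias is not None:
        y = y + np.asarray(bias)    # step 2: optional bias addition
    return ACTIVATIONS[activation](y)  # step 3: optional activation


if __name__ == "__main__":
    a = np.random.rand(2, 3, 4)
    b = np.random.rand(2, 5, 4)
    out = fused_matmul_reference(a, b, trans_b=True, alpha=0.5, activation="Relu")
    print(out.shape)
```

A reference like this can be useful for checking a decomposed graph against expected outputs on higher-rank inputs, since each stage maps to one of the three steps listed above.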

References

#27044 - [QNN EP] Add FusedMatMul operator support

Author

tirupath-qti

Parents

fd21d0aa
