onnxruntime
c1da27c2 - [QNN EP] Add Case-2 LPBQ pattern support for Gemm and Matmul nodes (#25865)

Commit
117 days ago
[QNN EP] Add Case-2 LPBQ pattern support for Gemm and Matmul nodes (#25865)

### Description
- The Case-2 LPBQ pattern omits the QuantizeLinear node from the LPBQ packing pattern.
- Modify the LPBQ fusion logic implemented in the QNN EP for Gemm and MatMul nodes to gracefully handle the optional QuantizeLinear node in the LPBQ packing pattern.
- Add unit tests to verify Case-2 LPBQ pattern fusion for Gemm and MatMul nodes.

### Motivation and Context
- The QuantizeLinear node in the LowPowerBlockQuantization (LPBQ) encoding packing pattern can be optional: omitting it keeps the weights in an INT datatype, which further reduces the size of the model.

---------

Co-authored-by: tirupath-qti <tirupath@qti.qualcomm.com>
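
As a rough illustration of what "gracefully handle the optional QuantizeLinear node" can mean for a fusion matcher, here is a minimal Python sketch over a toy graph. It is not the QNN EP implementation (which is C++ inside onnxruntime), and the commit message does not spell out the full LPBQ node sequence. The `Node`, `Match`, and `match_packing_pattern` names, the `weight_producer` field, and the simplified two-node weight chain are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Node:
    """Toy graph node (hypothetical; not an onnxruntime type)."""
    name: str
    op_type: str
    # Name of the node producing this node's weight-path input.
    # None means the input comes straight from a constant initializer.
    weight_producer: Optional[str] = None


@dataclass
class Match:
    """Result of a successful match on the simplified weight chain."""
    target: str
    dequantize: str
    quantize: Optional[str]  # None when QuantizeLinear is omitted


def match_packing_pattern(graph: Dict[str, Node], target_name: str) -> Optional[Match]:
    """Match [QuantizeLinear ->] DequantizeLinear -> Gemm/MatMul.

    The QuantizeLinear node is treated as optional, mirroring the idea
    described in the commit. The chain itself is a simplified assumption.
    """
    target = graph.get(target_name)
    if target is None or target.op_type not in ("Gemm", "MatMul"):
        return None

    dq = graph.get(target.weight_producer) if target.weight_producer else None
    if dq is None or dq.op_type != "DequantizeLinear":
        return None

    if dq.weight_producer is None:
        # QuantizeLinear omitted: weights are already stored as integers.
        return Match(target.name, dq.name, None)

    q = graph.get(dq.weight_producer)
    if q is not None and q.op_type == "QuantizeLinear":
        # QuantizeLinear present in the chain.
        return Match(target.name, dq.name, q.name)

    # Any other producer does not fit this simplified pattern.
    return None


# Usage: the same matcher accepts both variants of the toy chain.
with_q = {
    "w_q": Node("w_q", "QuantizeLinear"),
    "w_dq": Node("w_dq", "DequantizeLinear", weight_producer="w_q"),
    "mm": Node("mm", "MatMul", weight_producer="w_dq"),
}
without_q = {
    "w_dq": Node("w_dq", "DequantizeLinear"),
    "mm": Node("mm", "MatMul", weight_producer="w_dq"),
}
print(match_packing_pattern(with_q, "mm").quantize)     # w_q
print(match_packing_pattern(without_q, "mm").quantize)  # None
```

Returning a single `Match` with an optional `quantize` field lets downstream fusion code branch on one value instead of maintaining two separate matchers, which is one plausible way to keep both variants in a single code path.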