onnxruntime
c1da27c2 - [QNN EP] Add Case-2 LPBQ pattern support for Gemm and Matmul nodes (#25865)

Commit

117 days ago

[QNN EP] Add Case-2 LPBQ pattern support for Gemm and Matmul nodes (#25865)

### Description

- The Case-2 LPBQ pattern omits the QuantizeLinear node from the LPBQ packing pattern.
- Modify the LPBQ fusion logic implemented in QNN EP for Gemm and MatMul nodes to gracefully handle the optional QuantizeLinear node in the LPBQ packing pattern.
- Add unit tests to verify Case-2 LPBQ pattern fusion for Gemm and MatMul nodes.

### Motivation and Context

- The QuantizeLinear node in the LowPowerBlockQuantization encoding packing pattern can be optional: omitting it keeps the weights in an INT datatype, which further helps reduce the size of the model.

---------

Co-authored-by: tirupath-qti <tirupath@qti.qualcomm.com>
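To illustrate what "gracefully handle the optional QuantizeLinear node" can mean for a pattern matcher, here is a minimal Python sketch. It is not the QNN EP implementation, which is written in C++ and may match more nodes than shown here. The `Node` class, the `match_lpbq_weight_chain` function, the "case-1"/"case-2" labels, and the assumption that the weight sits on input index 1 are all hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    """Toy graph node (hypothetical). `inputs` holds producer nodes;
    None stands for a constant initializer or graph input."""
    op_type: str
    inputs: List[Optional["Node"]] = field(default_factory=list)


def match_lpbq_weight_chain(consumer: Node) -> Optional[Tuple[str, List[Node]]]:
    """Walk backward from the weight input of a Gemm/MatMul node.

    Assumption: the weight is input 1 and is produced by DequantizeLinear.
    If that DequantizeLinear is fed by a QuantizeLinear, report "case-1";
    if it is fed directly by a constant (None here), report "case-2".
    Any other shape is rejected by returning None.
    """
    if consumer.op_type not in ("Gemm", "MatMul"):
        return None
    if len(consumer.inputs) < 2:
        return None

    w_dq = consumer.inputs[1]
    if w_dq is None or w_dq.op_type != "DequantizeLinear" or not w_dq.inputs:
        return None

    producer = w_dq.inputs[0]
    if producer is None:
        # QuantizeLinear is absent: weights are already stored as integers.
        return ("case-2", [w_dq])
    if producer.op_type == "QuantizeLinear":
        return ("case-1", [producer, w_dq])
    return None


# Usage: build both variants of the weight chain and match them.
q = Node("QuantizeLinear", [None])
gemm_case1 = Node("Gemm", [None, Node("DequantizeLinear", [q])])
matmul_case2 = Node("MatMul", [None, Node("DequantizeLinear", [None])])

result_case1 = match_lpbq_weight_chain(gemm_case1)
result_case2 = match_lpbq_weight_chain(matmul_case2)
```

The design point is that the optional node is handled as a branch inside one matcher rather than as a second, separate fusion, so both variants share the same validation of the consumer and the DequantizeLinear node.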

References

#25865 - [QNN EP] Add Case-2 LPBQ pattern support for Gemm and Matmul nodes

Author

quic-tirupath

Parents

e6110d03
