onnxruntime
a7244592 - [QNN EP] Add 16x16 Gemm translation (#24849)

### Description
- QNN's 16x16 FC doesn't support asymmetric int16 weights.
- Insert a Convert op to convert the asymmetric uint16 weights to symmetric int16 weights (see the sketch below).
- Add unit tests to verify the 16x16 Gemm translation.

### Motivation and Context
This fix schedules 16x16 Gemm ops on the QNN HTP accelerator, which improves inference time for models that contain 16x16 Gemm operators.
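The inserted Convert op performs the requantization at graph level; the sketch below is only a minimal illustration of the underlying math, assuming the scale is kept unchanged and the zero point is shifted to 0. The function name and the use of numpy are illustrative, not part of the QNN EP implementation.

```python
import numpy as np

def asym_u16_to_sym_s16(weight_u16: np.ndarray, zero_point: int) -> np.ndarray:
    """Illustrative conversion of asymmetric uint16 quantized weights
    (zero point != 0) to symmetric int16 (zero point == 0) at the same scale.

    real_value = scale * (q_u16 - zero_point)   # asymmetric uint16
               = scale * q_s16                  # symmetric int16
    => q_s16 = q_u16 - zero_point, clamped to the int16 range.
    """
    shifted = weight_u16.astype(np.int32) - int(zero_point)
    return np.clip(shifted, -32768, 32767).astype(np.int16)

# Example: with zero point 32768, a uint16 value of 40000 maps to int16 7232.
w = np.array([0, 32768, 40000, 65535], dtype=np.uint16)
print(asym_u16_to_sym_s16(w, zero_point=32768))  # [-32768      0   7232  32767]
```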