onnxruntime
a7244592 - [QNN EP] Add 16x16 Gemm translation (#24849)

### Description
- QNN's 16x16 FC doesn't support asymmetric int16 weights.
- Insert a Convert op to convert the asymmetric uint16 weights to symmetric int16 weights (see the sketch below).
- Add unit tests to verify the 16x16 Gemm translation.

### Motivation and Context
This fix schedules 16x16 Gemm ops on the QNN HTP accelerator, which improves inference time for models that contain 16x16 Gemm operators.
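The inserted Convert op performs the requantization at graph level; the sketch below is only a minimal illustration of the underlying math, assuming the scale is kept unchanged and the zero point is shifted to 0. The function name and the use of numpy are illustrative, not part of the QNN EP implementation.

```python
import numpy as np

def asym_u16_to_sym_s16(weight_u16: np.ndarray, zero_point: int) -> np.ndarray:
    """Illustrative conversion of asymmetric uint16 quantized weights
    (zero point != 0) to symmetric int16 (zero point == 0) at the same scale.

    real_value = scale * (q_u16 - zero_point)   # asymmetric uint16
               = scale * q_s16                  # symmetric int16
    => q_s16 = q_u16 - zero_point, clamped to the int16 range.
    """
    shifted = weight_u16.astype(np.int32) - int(zero_point)
    return np.clip(shifted, -32768, 32767).astype(np.int16)

# Example: with zero point 32768, a uint16 value of 40000 maps to int16 7232.
w = np.array([0, 32768, 40000, 65535], dtype=np.uint16)
print(asym_u16_to_sym_s16(w, zero_point=32768))  # [-32768      0   7232  32767]
```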