onnxruntime
155e22d1 - MLAS: fuse float output into quantized GEMM (#4215)

5 years ago
Add more variants of MlasGemm that do a u8x8 GEMM with the output type as float. This fuses the common sequence of MatMulInteger + Cast + Mul(OutputScale) + optional Add(BiasVector).
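For illustration, the fused sequence can be written as a scalar reference loop. This is a minimal sketch, not the MLAS implementation or its API: the function name `ReferenceQGemmFloatOutput`, its parameter list, the row-major layout, and the choice of unsigned 8-bit data for both inputs are assumptions made for this example. It accumulates in int32 (MatMulInteger), converts to float (Cast), multiplies by a single output scale (Mul), and adds a per-column bias when one is supplied (Add).

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical reference helper (not part of MLAS). Assumes row-major
// A (M x K) and B (K x N), both uint8, with one zero point per matrix.
// Computes, for each output element:
//   C[m][n] = scale * sum_k (A[m][k] - zero_a) * (B[k][n] - zero_b) + bias[n]
// Pass bias == nullptr to skip the optional Add(BiasVector) step.
std::vector<float> ReferenceQGemmFloatOutput(
    const std::vector<std::uint8_t>& a,
    const std::vector<std::uint8_t>& b,
    std::size_t M, std::size_t N, std::size_t K,
    std::uint8_t zero_a, std::uint8_t zero_b,
    float scale, const float* bias)
{
    std::vector<float> c(M * N);
    for (std::size_t m = 0; m < M; ++m) {
        for (std::size_t n = 0; n < N; ++n) {
            // MatMulInteger: integer accumulation in 32 bits.
            std::int32_t acc = 0;
            for (std::size_t k = 0; k < K; ++k) {
                const std::int32_t av = static_cast<std::int32_t>(a[m * K + k]) - zero_a;
                const std::int32_t bv = static_cast<std::int32_t>(b[k * N + n]) - zero_b;
                acc += av * bv;
            }
            // Cast + Mul(OutputScale).
            float value = static_cast<float>(acc) * scale;
            // Optional Add(BiasVector), one bias value per output column.
            if (bias != nullptr) {
                value += bias[n];
            }
            c[m * N + n] = value;
        }
    }
    return c;
}
```

Doing the conversion, scaling, and bias add while each accumulator is still at hand avoids writing an intermediate int32 matrix and reading it back in separate Cast, Mul, and Add passes, which is the benefit a fused output stage of this kind is aiming for.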