onnxruntime 155e22d1 - MLAS: fuse float output into quantized GEMM (#4215)
Commit
5 years ago
MLAS: fuse float output into quantized GEMM (#4215)

Add more variants of MlasGemm that do a u8x8 GEMM with the output type as float. This fuses the common sequence of MatMulInteger + Cast + Mul(OutputScale) + optional Add(BiasVector).
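The fused sequence can be illustrated with a minimal pure-Python sketch. It is not MLAS code and does not reflect the MlasGemm signature: the function names (`matmul_integer`, `unfused`, `fused`) and the zero-point parameters are assumptions made for illustration, with the zero points following the general MatMulInteger operator semantics. Python floats are double precision, so this shows only the arithmetic structure, not float32 rounding.

```python
def matmul_integer(a, b, a_zero_point=0, b_zero_point=0):
    """Reference integer GEMM: subtract zero points, accumulate as integers."""
    rows, inner, cols = len(a), len(b), len(b[0])
    return [
        [
            sum((a[i][k] - a_zero_point) * (b[k][j] - b_zero_point)
                for k in range(inner))
            for j in range(cols)
        ]
        for i in range(rows)
    ]


def unfused(a, b, scale, bias=None, a_zero_point=0, b_zero_point=0):
    """Four separate passes, one per graph node."""
    acc = matmul_integer(a, b, a_zero_point, b_zero_point)  # MatMulInteger
    out = [[float(v) for v in row] for row in acc]          # Cast
    out = [[v * scale for v in row] for row in out]         # Mul(OutputScale)
    if bias is not None:                                    # optional Add(BiasVector)
        out = [[v + bias[j] for j, v in enumerate(row)] for row in out]
    return out


def fused(a, b, scale, bias=None, a_zero_point=0, b_zero_point=0):
    """Single pass: each integer accumulator is converted, scaled and
    biased as soon as it is complete, so no intermediate matrix is stored."""
    rows, inner, cols = len(a), len(b), len(b[0])
    out = []
    for i in range(rows):
        row = []
        for j in range(cols):
            acc = 0
            for k in range(inner):
                acc += (a[i][k] - a_zero_point) * (b[k][j] - b_zero_point)
            value = float(acc) * scale
            if bias is not None:
                value += bias[j]  # one bias entry per output column
            row.append(value)
        out.append(row)
    return out
```

Both functions compute the same values; the fused form avoids materializing the intermediate int32 and float matrices, which is presumably the motivation for producing float output directly from the GEMM.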
References
#4215 - MLAS: fuse float output into quantized GEMM
Author
tracysh
Parents
2e3607c7