onnxruntime
2ba637c5 - Implement Scale function for quant gemm (#5632)

Commit
5 years ago
Implement Scale function for quant gemm (#5632)

* Implement a Scale function for quantization

Quantized GEMM is always followed by scaling (PerTensor or PerColumn), and the result often needs to be accumulated into an existing matrix. This PR implements a post-processor that scales the quantized GEMM result and accumulates it into another matrix.
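The post-processing step described above can be sketched in a few lines. This is a minimal illustration in plain Python, not the code from this commit: the function name `scale_and_accumulate`, its parameters, and the list-of-lists matrix layout are all hypothetical. It assumes the quantized GEMM has already produced an integer accumulator matrix, and that the scale is either one float (per-tensor) or one float per output column (per-column).

```python
from typing import List, Sequence, Union


def scale_and_accumulate(
    acc: List[List[int]],
    scale: Union[float, Sequence[float]],
    out: List[List[float]],
    accumulate: bool = True,
) -> List[List[float]]:
    """Hypothetical post-processor for an integer GEMM result.

    acc        -- M x N integer accumulators produced by the quantized GEMM
    scale      -- one float (per-tensor) or N floats (per-column)
    out        -- M x N float matrix, updated in place
    accumulate -- if True, add the scaled result to `out`; otherwise overwrite
    """
    n_cols = len(acc[0]) if acc else 0

    # Normalize both scaling modes to one scale per column.
    if isinstance(scale, (int, float)):
        col_scales = [float(scale)] * n_cols
    else:
        col_scales = [float(s) for s in scale]
        if len(col_scales) != n_cols:
            raise ValueError("per-column scale must have one entry per column")

    for i, row in enumerate(acc):
        for j, value in enumerate(row):
            scaled = value * col_scales[j]
            out[i][j] = out[i][j] + scaled if accumulate else scaled
    return out
```

With a per-tensor scale, every element is multiplied by the same factor; with a per-column scale, column `j` uses `scale[j]`. Folding the accumulation into the same pass avoids writing a temporary float matrix and reading it back, which is presumably the motivation for doing both in one post-processor.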