onnxruntime
2ba637c5 - Implement Scale function for quant gemm (#5632)

Commit

5 years ago

Implement Scale function for quant gemm (#5632)

* Implement a Scale function for quantization

Quantized GEMM is always followed by scaling (per-tensor or per-column), and the scaled result often needs to be accumulated into an existing matrix. This PR implements a post-processor that scales the quantized GEMM result and accumulates it into another matrix.
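The post-processing step described above can be illustrated with a minimal sketch. This is not the code from the PR: the function name `scale_and_accumulate`, its parameters, and the use of plain Python lists are assumptions chosen for illustration. It takes the int32 accumulator matrix produced by a quantized GEMM, multiplies it by either a single per-tensor scale or one scale per column, and either adds the result to an existing float matrix or overwrites it.

```python
def scale_and_accumulate(acc_int32, out, scale, accumulate=True):
    """Hypothetical helper: scale a quantized GEMM result and write it to `out`.

    acc_int32  -- rows x cols matrix of int32 accumulators (list of lists)
    out        -- rows x cols float matrix, updated in place
    scale      -- a single float (per-tensor) or a sequence of `cols` floats (per-column)
    accumulate -- if True, add to the existing values in `out`; otherwise overwrite them
    """
    rows = len(acc_int32)
    cols = len(acc_int32[0]) if rows else 0

    # A scalar means per-tensor scaling; anything else is treated as per-column.
    per_column = not isinstance(scale, (int, float))
    if per_column and len(scale) != cols:
        raise ValueError("per-column scale needs one entry per output column")

    for i in range(rows):
        for j in range(cols):
            s = scale[j] if per_column else scale
            value = acc_int32[i][j] * s
            out[i][j] = out[i][j] + value if accumulate else value
    return out


# Per-tensor scaling, accumulated into an existing matrix of ones.
result = scale_and_accumulate([[2, 4], [6, 8]], [[1.0, 1.0], [1.0, 1.0]], 0.5)
print(result)  # [[2.0, 3.0], [4.0, 5.0]]
```

Passing a sequence such as `[0.5, 0.25]` as `scale` selects the per-column mode, and `accumulate=False` overwrites the output instead of adding to it. Fusing both steps into one pass over the accumulator avoids materializing an intermediate float matrix, which is the usual motivation for a GEMM post-processor of this kind.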

References

#5632 - Implement Scale function for quant gemm

Author

yufenglee

Parents

cca8cd84
