onnxruntime
cbdd0bb7 - QAttention calls into MatMulIntToFloat instead of Dequantize+GEMM (#16851)

Commit

2 years ago

QAttention calls into MatMulIntToFloat instead of Dequantize+GEMM (#16851)

### Description

Update QAttention to call into MatMulIntToFloat instead of Dequantize+GEMM, enabling more metacommand paths.
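The arithmetic behind this substitution can be illustrated with a small sketch. It is not the code from this commit: the function names are hypothetical, it uses plain Python lists rather than any onnxruntime API, and it assumes per-tensor scales and zero points. It only shows that dequantizing both operands and then multiplying gives the same result as multiplying the zero-point-adjusted integers and applying the combined scale once.

```python
def dequantize(q, scale, zero_point):
    """Map a quantized integer matrix to floats: (q - zero_point) * scale."""
    return [[(v - zero_point) * scale for v in row] for row in q]


def matmul(a, b):
    """Plain row-by-column matrix product for lists of lists."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def dequantize_then_gemm(a_q, a_scale, a_zp, b_q, b_scale, b_zp):
    """Two-step path: dequantize both inputs, then do a float matrix product."""
    return matmul(dequantize(a_q, a_scale, a_zp), dequantize(b_q, b_scale, b_zp))


def matmul_int_to_float(a_q, a_scale, a_zp, b_q, b_scale, b_zp):
    """Single-step path: integer matrix product, then one combined scale."""
    a_int = [[v - a_zp for v in row] for row in a_q]
    b_int = [[v - b_zp for v in row] for row in b_q]
    acc = matmul(a_int, b_int)        # integer accumulation
    combined = a_scale * b_scale      # assumption: per-tensor scales
    return [[v * combined for v in row] for row in acc]


# Illustrative values only (assumed uint8 activations, int8 weights).
a_q = [[130, 125, 128], [0, 255, 200]]
b_q = [[1, -2], [3, 4], [-5, 6]]
a_scale, a_zp = 0.5, 128
b_scale, b_zp = 0.25, 0

ref = dequantize_then_gemm(a_q, a_scale, a_zp, b_q, b_scale, b_zp)
fused = matmul_int_to_float(a_q, a_scale, a_zp, b_q, b_scale, b_zp)
print(fused)  # [[-0.875, -2.0], [-13.375, 149.5]]
```

The scales here are powers of two, so both paths agree exactly. With arbitrary scales the two results can differ by floating-point rounding, so a tolerance-based comparison would be more appropriate.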

References

#16851 - QAttention calls into MatMulIntToFloat instead of Dequantize+GEMM

Author

zhangxiang1993

Parents

c19e4c02
