onnxruntime
20cd3394 - [MLAS] AArch64 SQNBitGemm CompInt8 initial multi-row implementation (#21193)

Commit
1 year ago
Update the AArch64 SQNBitGemm CompInt8 kernels to process the matrix in tiles. For example, computing the output in 2x2 tiles lets us compute four elements of the output with one read of two rows of A and two columns of B.

Also moved some code around, as it was getting too big for a single file.
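To illustrate the 2x2 tiling idea described above, here is a minimal scalar sketch. It is not the MLAS kernel: it omits N-bit quantization, block scales, and NEON intrinsics, and the names `GemmNaive` and `GemmTiled2x2` and the transposed layout `BT` (each column of B stored contiguously) are assumptions made for this example only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical reference: one output element per pass over a row of A and a column of B.
// A is row-major M x K. BT holds B's columns contiguously (N x K), an assumed layout.
std::vector<int32_t> GemmNaive(const std::vector<int8_t>& A, const std::vector<int8_t>& BT,
                               size_t M, size_t N, size_t K) {
    assert(A.size() == M * K && BT.size() == N * K);
    std::vector<int32_t> C(M * N, 0);
    for (size_t m = 0; m < M; ++m)
        for (size_t n = 0; n < N; ++n)
            for (size_t k = 0; k < K; ++k)
                C[m * N + n] += int32_t(A[m * K + k]) * int32_t(BT[n * K + k]);
    return C;
}

// Hypothetical 2x2-tiled variant: each loaded value of A and B feeds two accumulators,
// so four outputs come from one read of two rows of A and two columns of B.
std::vector<int32_t> GemmTiled2x2(const std::vector<int8_t>& A, const std::vector<int8_t>& BT,
                                  size_t M, size_t N, size_t K) {
    assert(A.size() == M * K && BT.size() == N * K);
    std::vector<int32_t> C(M * N, 0);
    // Single-element fallback used for leftover rows and columns.
    auto dot = [&](size_t m, size_t n) {
        int32_t acc = 0;
        for (size_t k = 0; k < K; ++k)
            acc += int32_t(A[m * K + k]) * int32_t(BT[n * K + k]);
        return acc;
    };
    size_t m = 0;
    for (; m + 2 <= M; m += 2) {
        const int8_t* a0 = A.data() + m * K;
        const int8_t* a1 = a0 + K;
        size_t n = 0;
        for (; n + 2 <= N; n += 2) {
            const int8_t* b0 = BT.data() + n * K;
            const int8_t* b1 = b0 + K;
            int32_t c00 = 0, c01 = 0, c10 = 0, c11 = 0;
            for (size_t k = 0; k < K; ++k) {
                // Four loads per step of k, four multiply-accumulates.
                const int32_t x0 = a0[k], x1 = a1[k], y0 = b0[k], y1 = b1[k];
                c00 += x0 * y0; c01 += x0 * y1;
                c10 += x1 * y0; c11 += x1 * y1;
            }
            C[m * N + n] = c00;       C[m * N + n + 1] = c01;
            C[(m + 1) * N + n] = c10; C[(m + 1) * N + n + 1] = c11;
        }
        for (; n < N; ++n) {  // leftover column when N is odd
            C[m * N + n] = dot(m, n);
            C[(m + 1) * N + n] = dot(m + 1, n);
        }
    }
    for (; m < M; ++m)  // leftover row when M is odd
        for (size_t n = 0; n < N; ++n)
            C[m * N + n] = dot(m, n);
    return C;
}
```

In the naive loop, each output element costs 2K loads. In the 2x2 tile, four output elements share 4K loads, so the load count per output element is halved. That reuse is the motivation the commit message gives for tiling.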