onnxruntime
05fc0c60 - [MLAS][AArch64] SQNBitGemm CompInt8 - Use 4x2 tiles (#21380)

Commit
1 year ago
Update the SQNBitGemm ARM NEON kernel to compute a 4x2 tile of output at a time.

Note: 2x4 and 4x4 tiles were also tried, but 4x2 tiles gave the best microbenchmark results.
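For readers unfamiliar with output tiling, here is a minimal scalar sketch of the idea of computing a 4x2 tile of the output per inner loop. It is not the MLAS kernel: the actual change uses ARM NEON and operates on block-quantized 4-bit weights with scales, all of which is omitted here. The function names (`GemmInt8Reference`, `GemmInt8Tiled4x2`) and the data layout (B stored with each output column's K values contiguous) are assumptions made for illustration only.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical reference: computes one output element at a time.
// A is M x K row-major, B is N x K (one contiguous K-vector per output
// column), C is M x N row-major with int32 accumulation.
void GemmInt8Reference(const int8_t* A, const int8_t* B, int32_t* C,
                       size_t M, size_t N, size_t K) {
    for (size_t m = 0; m < M; ++m) {
        for (size_t n = 0; n < N; ++n) {
            int32_t acc = 0;
            for (size_t k = 0; k < K; ++k) {
                acc += int32_t(A[m * K + k]) * int32_t(B[n * K + k]);
            }
            C[m * N + n] = acc;
        }
    }
}

// Hypothetical tiled version: computes a 4 (rows) x 2 (columns) tile of C
// per pass over K. Each loaded A value is reused for 2 columns and each
// loaded B value for 4 rows, which is the reuse a register-tiled kernel
// is after. Partial tiles at the edges are handled by clamping the counts.
void GemmInt8Tiled4x2(const int8_t* A, const int8_t* B, int32_t* C,
                      size_t M, size_t N, size_t K) {
    constexpr size_t TileM = 4;
    constexpr size_t TileN = 2;
    for (size_t m0 = 0; m0 < M; m0 += TileM) {
        const size_t mCount = std::min(TileM, M - m0);
        for (size_t n0 = 0; n0 < N; n0 += TileN) {
            const size_t nCount = std::min(TileN, N - n0);
            int32_t acc[TileM][TileN] = {};  // 8 accumulators per tile
            for (size_t k = 0; k < K; ++k) {
                for (size_t i = 0; i < mCount; ++i) {
                    const int32_t a = A[(m0 + i) * K + k];
                    for (size_t j = 0; j < nCount; ++j) {
                        acc[i][j] += a * int32_t(B[(n0 + j) * K + k]);
                    }
                }
            }
            // Write the finished tile back to C.
            for (size_t i = 0; i < mCount; ++i) {
                for (size_t j = 0; j < nCount; ++j) {
                    C[(m0 + i) * N + (n0 + j)] = acc[i][j];
                }
            }
        }
    }
}
```

Both functions should produce identical results for any M, N, K. Which tile shape is fastest depends on register pressure and load patterns on the target core, which is presumably why the commit compared 2x4, 4x4, and 4x2 empirically.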