onnxruntime
[MLAS] AArch64 SQNBitGemm CompInt8 initial multi-row implementation
#21193
Merged
Commits (14)
initial impl for m=2 kernel that computes 2x2 outputs at a time, for blklen > 32
edgchen1 committed 1 year ago

support blklen 32, zero point
edgchen1 committed 1 year ago

implement tiling for blklen 32
edgchen1 committed 1 year ago

move to tiling approach for sqnbitgemm compint8 impl
edgchen1 committed 1 year ago

fix returned registered test count
edgchen1 committed 1 year ago

use variable for HasZeroPoint template parameter value
edgchen1 committed 1 year ago

split out sqnbitgemm ARM NEON impl into multiple files
edgchen1 committed 1 year ago

Merge remote-tracking branch 'origin/main' into edgchen1/sqnbitgemm_multi_row
edgchen1 committed 1 year ago

update sqnbitgemm avx code to use new SQ4BitGemmKernel_CompInt8 interface
edgchen1 committed 1 year ago

put impl into unnamed namespace, comment
edgchen1 committed 1 year ago

fix zp loading
edgchen1 committed 1 year ago

fix post processor call arguments
edgchen1 committed 1 year ago

helper functions for advancing row/col ptrs
edgchen1 committed 1 year ago

fix indentation
edgchen1 committed 1 year ago