[MLAS] Add 8-bit weights ARM64 Gemm implementation #25110
finished prepack
b52a1ce1
changed interface to support blocksum2
05231068
finished quantb for quant a unsigned
fd92ab89
finished quantize a
ed5cf8d2
finished Q8Int8GemmR2xC8Neon
b9b9691e
finished kernels
685baffa
fixed build
67473307
passed prepack
b0873174
finished ut for quant a
196c04c2
fixed build
353d460b
Merge remote-tracking branch 'origin/main' into hari/matmul8bits_arm
4d62e32c
Comment out some 4 bit tests
e88e32d7
Apple I8MM check
58011b04
Tests
acc4b812
Tests 2
2700493c
Update onnxruntime/test/mlas/unittest/test_sq8bitgemm.cpp
76de326a
Changes
159d4d32
Fixes
e4bc74ea
Re-enable 4 bit tests
e92055bd
Stage
94f30224
Some tests work
61c18728
Git attempt
16da92b0
Lint attempt
3ce481d9
Update onnxruntime/test/mlas/unittest/test_sq8bitgemm.cpp
29f66bd4
More changes
987574bc
Merge branch 'hari/matmul8bits_arm' of https://github.com/microsoft/o…
d921b06a
Fix tests
cf92e6f1
Stage
8156fc79
Stage
9a1fe225
Update onnxruntime/test/mlas/unittest/test_sq8bitgemm.cpp
31c8f931
Update onnxruntime/test/mlas/unittest/test_sq8bitgemm.cpp
92ec5ff0
Try fix x86 builds
7159d5e7
Merge branch 'hari/matmul8bits_arm' of https://github.com/microsoft/o…
7ad1d36c
Try fix lint errors
03f29166
Yipee zero point tests are all passing
47420b51
Comments and Nits
2a5100de
hariharans29
changed the title from "[DO NOT REVIEW] [MLAS] 8 bit weights ARM64 Matmul implementation" to "WIP: [MLAS] 8 bit weights ARM64 Matmul implementation" 275 days ago
Enable MatmulNBits test
d64568b2
Fixes
0c557559
Merge remote-tracking branch 'origin/main' into hari/matmul8bits_arm
01d4a98a
a
c8188d44
I8MM support re-enable
635eec9b
Fix warning
f736faea
Enable tests with ZP = false
aa794671
hariharans29
changed the title from "WIP: [MLAS] 8 bit weights ARM64 Matmul implementation" to "[MLAS] 8 bit weights ARM64 Matmul implementation" 274 days ago
hariharans29
changed the title from "[MLAS] 8 bit weights ARM64 Matmul implementation" to "[MLAS] Add 8-bit weights ARM64 Gemm implementation" 274 days ago
Update onnxruntime/contrib_ops/cpu/quantization/matmul_nbits.cc
10e3afa2
I8MM fixes
c4331e01
Remove unnecessary template
5b7c3af7
Resolve conflicts and update PR with more fixes
9ae58eec
Fix warning
b6cd309d
Properly remove warning
98f5fe0f
Merge remote-tracking branch 'origin' into hari/matmul8bits_arm
0d9442b4
PR feedback
9c2faa66
Refine
47e2420e
Update onnxruntime/contrib_ops/cpu/quantization/matmul_nbits.cc
5eb9ed9c
Update onnxruntime/contrib_ops/cpu/quantization/matmul_nbits.cc
bb978f72
Update onnxruntime/contrib_ops/cpu/quantization/matmul_nbits.cc
9b5c3891
Update onnxruntime/contrib_ops/cpu/quantization/matmul_nbits.cc
76d085b1
Update onnxruntime/test/contrib_ops/matmul_8bits_test.cc
12e3a1d1
Update onnxruntime/contrib_ops/cpu/quantization/matmul_nbits.cc
3827317a
Update onnxruntime/contrib_ops/cpu/quantization/matmul_nbits.cc
d8f4235e
hariharans29
force pushed from 2d2bf2b0 to 47e2420e 207 days ago
Ignore sending scales while pre-packing weights on ARM64
46aa3629
Fix warning
2c956ae7
4 bit fix
83296bba
Merge branch 'hari/matmul8bits_arm' of https://github.com/microsoft/o…
8f145007
Update onnxruntime/test/contrib_ops/matmul_8bits_test.cc
405105b2
Lint
303e8677
Merge branch 'hari/matmul8bits_arm' of https://github.com/microsoft/o…
91de908d
Fix lintrunner mess-up once and for all
890a046a
Update onnxruntime/contrib_ops/cpu/quantization/matmul_nbits.cc
ec0c8abe
Update onnxruntime/contrib_ops/cpu/quantization/matmul_nbits.cc
1f71f6cd
Lint
376fc1b5
jywu-msft
dismissed these changes
on 2025-09-03
hariharans29
changed the title from "[MLAS] Add 8-bit weights ARM64 Gemm implementation" to "[DO NOT MERGE][MLAS] Add 8-bit weights ARM64 Gemm implementation" 206 days ago
More fixes
eefa72c0
hariharans29
dismissed their stale review via eefa72c0 206 days ago
hariharans29
marked this pull request as ready for review 206 days ago
hariharans29
changed the title from "[DO NOT MERGE][MLAS] Add 8-bit weights ARM64 Gemm implementation" to "[MLAS] Add 8-bit weights ARM64 Gemm implementation" 206 days ago
edgchen1
dismissed these changes
on 2025-09-04
PR comments
e1da3d5c
hariharans29
dismissed their stale review via e1da3d5c 205 days ago
Missed out on one
77dff226
edgchen1
dismissed these changes
on 2025-09-04
Remove guards
7404cb37
hariharans29
dismissed their stale review via 7404cb37 205 days ago
Merge remote-tracking branch 'origin/main' into hari/matmul8bits_arm
edb3d728
edgchen1
approved these changes
on 2025-09-04
hariharans29
deleted the hari/matmul8bits_arm branch 205 days ago