onnxruntime
[MLAS] Add 8-bit weights ARM64 Gemm implementation
#25110
Merged

hariharans29 merged 75 commits into main from hari/matmul8bits_arm
fajin-corp finished prepack
b52a1ce1
fajin-corp changed interface to support blocksum2
05231068
fajin-corp finished quantb for quant a unsigned
fd92ab89
fajin-corp finished quantize a
ed5cf8d2
fajin-corp finished Q8Int8GemmR2xC8Neon
b9b9691e
fajin-corp finished kernels
685baffa
fajin-corp fixed build
67473307
fajin-corp passed prepack
b0873174
fajin-corp finished ut for quant a
196c04c2
fajin-corp fixed build
353d460b
hariharans29 Merge remote-tracking branch 'origin/main' into hari/matmul8bits_arm
4d62e32c
github-actions commented on 2025-06-18
hariharans29 Comment out some 4 bit tests
e88e32d7
hariharans29 Apple I8MM check
58011b04
hariharans29 Tests
acc4b812
hariharans29 Tests 2
2700493c
hariharans29 Update onnxruntime/test/mlas/unittest/test_sq8bitgemm.cpp
76de326a
hariharans29 Changes
159d4d32
hariharans29 Fixes
e4bc74ea
hariharans29 Re-enable 4 bit tests
e92055bd
hariharans29 Stage
94f30224
github-actions commented on 2025-06-25
hariharans29 Some tests work
61c18728
hariharans29 Git attempt
16da92b0
hariharans29 Lint attempt
3ce481d9
github-actions commented on 2025-06-25
hariharans29 Update onnxruntime/test/mlas/unittest/test_sq8bitgemm.cpp
29f66bd4
hariharans29 More changesc
987574bc
hariharans29 Merge branch 'hari/matmul8bits_arm' of https://github.com/microsoft/o…
d921b06a
hariharans29 Fix tests
cf92e6f1
github-actions commented on 2025-06-25
hariharans29 Stage
8156fc79
hariharans29 Stage
9a1fe225
github-actions commented on 2025-06-26
hariharans29 Update onnxruntime/test/mlas/unittest/test_sq8bitgemm.cpp
31c8f931
hariharans29 Update onnxruntime/test/mlas/unittest/test_sq8bitgemm.cpp
92ec5ff0
hariharans29 Try fix x86 builds
7159d5e7
hariharans29 Merge branch 'hari/matmul8bits_arm' of https://github.com/microsoft/o…
7ad1d36c
hariharans29 Try fix lint errors
03f29166
hariharans29 Yipee zero point tests are all passing
47420b51
hariharans29 Comments and Nits
2a5100de
hariharans29 changed the title from "[DO NOT REVIEW] [MLAS] 8 bit weights ARM64 Matmul implementation" to "WIP: [MLAS] 8 bit weights ARM64 Matmul implementation" 275 days ago
hariharans29 Enable MatmulNBits test
d64568b2
hariharans29 Fixes
0c557559
hariharans29 Merge remote-tracking branch 'origin/main' into hari/matmul8bits_arm
01d4a98a
hariharans29 a
c8188d44
hariharans29 I8MM support re-enable
635eec9b
hariharans29 Fix warning
f736faea
hariharans29 Enable tests with ZP = false
aa794671
hariharans29 changed the title from "WIP: [MLAS] 8 bit weights ARM64 Matmul implementation" to "[MLAS] 8 bit weights ARM64 Matmul implementation" 274 days ago
github-actions commented on 2025-06-28
hariharans29 changed the title from "[MLAS] 8 bit weights ARM64 Matmul implementation" to "[MLAS] Add 8-bit weights ARM64 Gemm implementation" 274 days ago
hariharans29 Update onnxruntime/contrib_ops/cpu/quantization/matmul_nbits.cc
10e3afa2
hariharans29 assigned jywu-msft 274 days ago
hariharans29 assigned edgchen1 274 days ago
hariharans29 I8MM fixes
c4331e01
hariharans29 commented on 2025-06-28
hariharans29 Remove unnecessary template
5b7c3af7
jywu-msft requested a review from edgchen1 273 days ago
jywu-msft requested a review from copilot-pull-request-reviewer 273 days ago
copilot-pull-request-reviewer commented on 2025-06-28
vraspar commented on 2025-07-07
hariharans29 Resolve conflicts and update PR with more fixes
9ae58eec
hariharans29 Fix warning
b6cd309d
hariharans29 Properly remove warning
98f5fe0f
hariharans29 Merge remote-tracking branch 'origin' into hari/matmul8bits_arm
0d9442b4
edgchen1 commented on 2025-08-09
edgchen1 commented on 2025-08-12
hariharans29 PR feedback
9c2faa66
github-actions commented on 2025-09-02
hariharans29 Refine
47e2420e
github-actions commented on 2025-09-03
hariharans29 Update onnxruntime/contrib_ops/cpu/quantization/matmul_nbits.cc
5eb9ed9c
hariharans29 Update onnxruntime/contrib_ops/cpu/quantization/matmul_nbits.cc
bb978f72
hariharans29 Update onnxruntime/contrib_ops/cpu/quantization/matmul_nbits.cc
9b5c3891
hariharans29 Update onnxruntime/contrib_ops/cpu/quantization/matmul_nbits.cc
76d085b1
hariharans29 Update onnxruntime/test/contrib_ops/matmul_8bits_test.cc
12e3a1d1
hariharans29 Update onnxruntime/contrib_ops/cpu/quantization/matmul_nbits.cc
3827317a
hariharans29 Update onnxruntime/contrib_ops/cpu/quantization/matmul_nbits.cc
d8f4235e
hariharans29 added release:1.23.0
hariharans29 force pushed from 2d2bf2b0 to 47e2420e 207 days ago
github-actions commented on 2025-09-03
hariharans29 Ignore sending scales while pre-packing weights on ARM64
46aa3629
hariharans29 Fix warning
2c956ae7
github-actions commented on 2025-09-03
hariharans29 4 bit fix
83296bba
github-actions commented on 2025-09-03
hariharans29 Merge branch 'hari/matmul8bits_arm' of https://github.com/microsoft/o…
8f145007
hariharans29 Update onnxruntime/test/contrib_ops/matmul_8bits_test.cc
405105b2
hariharans29 Lint
303e8677
hariharans29 Merge branch 'hari/matmul8bits_arm' of https://github.com/microsoft/o…
91de908d
hariharans29 Fix lintrunner mess-up once and for all
890a046a
hariharans29 requested a review from copilot-pull-request-reviewer 206 days ago
copilot-pull-request-reviewer commented on 2025-09-03
github-actions commented on 2025-09-03
hariharans29 Update onnxruntime/contrib_ops/cpu/quantization/matmul_nbits.cc
ec0c8abe
hariharans29 Update onnxruntime/contrib_ops/cpu/quantization/matmul_nbits.cc
1f71f6cd
hariharans29 Lint
376fc1b5
jywu-msft dismissed these changes on 2025-09-03
hariharans29 changed the title from "[MLAS] Add 8-bit weights ARM64 Gemm implementation" to "[DO NOT MERGE][MLAS] Add 8-bit weights ARM64 Gemm implementation" 206 days ago
hariharans29 marked this pull request as draft 206 days ago
hariharans29 More fixes
eefa72c0
hariharans29 dismissed their stale review via eefa72c0 206 days ago
hariharans29 marked this pull request as ready for review 206 days ago
hariharans29 changed the title from "[DO NOT MERGE][MLAS] Add 8-bit weights ARM64 Gemm implementation" to "[MLAS] Add 8-bit weights ARM64 Gemm implementation" 206 days ago
edgchen1 commented on 2025-09-03
edgchen1 dismissed these changes on 2025-09-04
hariharans29 PR comments
e1da3d5c
hariharans29 dismissed their stale review via e1da3d5c 205 days ago
hariharans29 Missed out on one
77dff226
edgchen1 dismissed these changes on 2025-09-04
hariharans29 Remove guards
7404cb37
hariharans29 dismissed their stale review via 7404cb37 205 days ago
hariharans29 Merge remote-tracking branch 'origin/main' into hari/matmul8bits_arm
edb3d728
edgchen1 approved these changes on 2025-09-04
hariharans29 merged 31dcc606 into main 205 days ago
hariharans29 deleted the hari/matmul8bits_arm branch 205 days ago
tianleiwu added cherry-picked
tianleiwu removed release:1.23.0
