onnxruntime
[MLAS AArch64] SQNBitGemm CompInt8 kernel
#18953
Merged

edgchen1 merged 36 commits into main from edgchen1/sqnbitgemm_quantize_a
edgchen1 only register q4gemm benchmarks if q4gemm is available
8940c0a5
edgchen1 some mlas cmake updates
a6a8ce62
edgchen1 change BlkLen from template param to function param
53a46ca8
edgchen1 Save work
e2a9eee8
edgchen1 only enable benchmark if available
966a9150
edgchen1 handle workspace in benchmark
b59e7e13
edgchen1 QuantizeARow neon impl1
585103be
edgchen1 dot compint8 neon impl
c26cef4f
edgchen1 use single workspace pointer in interface, get matmul_nbits working
1b7d81b4
edgchen1 Merge remote-tracking branch 'origin/main' into edgchen1/sqnbitgemm_q…
f7e3db50
edgchen1 renaming and cleanup
71bd3a92
edgchen1 try different comp types in matmulnbits
f7127f9f
edgchen1 Merge remote-tracking branch 'origin/main' into edgchen1/sqnbitgemm_q…
0060f554
edgchen1 rename enum, add doc
b3147c6c
edgchen1 change quant b params from uint8_t* to std::byte*
789bcdcd
edgchen1 handle CompUndef
039dd92b
edgchen1 commented on 2023-12-29
edgchen1 changed the title from "[MLAS AArch64] SQNBitGemm CompInt8 kernel" to "[WIP][MLAS AArch64] SQNBitGemm CompInt8 kernel" 1 year ago
edgchen1 check if dot product instructions are available before setting SQNBit…
cb9f4287
edgchen1 try to fix compile issue
437ad52a
edgchen1 move zero initialize out of unrolled loop
241ca27d
edgchen1 update comment
53e2ae29
edgchen1 split out float conversion
d5b26b4d
edgchen1 remove impl0_reference
02cf7b37
edgchen1 use thread per gemm in prepare workspace fn, reorder include
5b4a86c7
edgchen1 changed the title from "[WIP][MLAS AArch64] SQNBitGemm CompInt8 kernel" to "[MLAS AArch64] SQNBitGemm CompInt8 kernel" 1 year ago
edgchen1 marked this pull request as ready for review 1 year ago
edgchen1 requested a review 1 year ago
edgchen1 requested a review from skottmckay 1 year ago
edgchen1 requested a review from chenfucn 1 year ago
edgchen1 requested a review from yihonglyu 1 year ago
edgchen1 requested a review from yufenglee 1 year ago
yufenglee commented on 2024-01-03
edgchen1 make pointer const
61998ea6
edgchen1 commented on 2024-01-03
edgchen1 Merge remote-tracking branch 'origin/main' into edgchen1/sqnbitgemm_q…
fe7f0e70
edgchen1 remove unneeded and
d54cbd96
edgchen1 Merge remote-tracking branch 'origin/main' into edgchen1/sqnbitgemm_q…
7d8753cb
edgchen1 move code from merge conflict
6d88a0b4
edgchen1 pack quant b data
ccaa9947
edgchen1 get matmulnbits working, add docs
cff3cb47
edgchen1 Merge remote-tracking branch 'origin/main' into edgchen1/sqnbitgemm_q…
f8aba0cd
yufenglee commented on 2024-01-11
edgchen1 use threadpool to pack b data
33e6dd90
yufenglee commented on 2024-01-11
yufenglee commented on 2024-01-11
edgchen1 shorten names, update docs
4cd2474c
yufenglee commented on 2024-01-11
edgchen1 rename another function, add check for implementation in MlasSQNBitGe…
9244a3f1
edgchen1 move b_data_block_offset out of unrolled loop body
86f84ea0
yufenglee dismissed these changes on 2024-01-12
edgchen1 commented on 2024-01-12
edgchen1 move b data offset out of unrolled loop in compfp32 kernel
23373759
edgchen1 dismissed their stale review via 23373759 1 year ago
edgchen1 requested a review from yufenglee 1 year ago
yufenglee approved these changes on 2024-01-12
edgchen1 merged 150c4cb8 into main 1 year ago
edgchen1 deleted the edgchen1/sqnbitgemm_quantize_a branch 1 year ago
