onnxruntime
[MLAS][AArch64] SQ4BitGemm CompInt8 multi-block implementation
#19826
Merged

edgchen1 merged 8 commits into main from edgchen1/sqnbitgemm_multiblock
edgchen1
edgchen1 add threads=4 benchmark tests
3ed0c314
edgchen1 blklen 32+ multi block impl
7812aa30
edgchen1 blklen 16 multi block impl
9276066c
edgchen1 make scale a vector
3c37356f
edgchen1 changed the title from "[MLAS][AArch64] SQNBitGemm CompInt8 multi-block implementation" to "[MLAS][AArch64] SQ4BitGemm CompInt8 multi-block implementation" 1 year ago
edgchen1 add cast to const uint8_t*
6038ca03
edgchen1 use correct vreinterpret call
474d11fe
edgchen1 add benchmark that gets args from environment variables
3b474b83
edgchen1 marked this pull request as ready for review 1 year ago
edgchen1 requested a review 1 year ago
edgchen1 Merge remote-tracking branch 'origin/main' into edgchen1/sqnbitgemm_m…
ec786ec5
edgchen1
azure-pipelines
yufenglee
edgchen1
yufenglee
yufenglee approved these changes on 2024-03-14
edgchen1 merged 0b90363a into main 1 year ago
edgchen1 deleted the edgchen1/sqnbitgemm_multiblock branch 1 year ago
