MLAS AArch64 quantized int4 Gemm kernel #18031
change AddBiasAvx to MlasAddBiasForGemm
340ba219
use std::cerr instead of logging in cpuid_uarch.cc
05c04ef2
add infrastructure for q4gemm neon impl
6fe9b81a
make test_main.cc threadpool a unique_ptr to avoid memory leak output…
f7bd0299
initial neon impl of q4gemm
85e76cea
reference implementation of MlasBlkQ4DequantBNeon
922434f2
Merge remote-tracking branch 'origin/main' into edgchen1/arm64_q4gemm
9dfa8c2d
WIP optimize MlasQ4GemmKernelNeon
6048f8c9
fix some bugs, enable implementation for other Q4Types, remove old im…
a88eedc7
Merge remote-tracking branch 'origin/main' into edgchen1/arm64_q4gemm
d83023bc
uncomment an impl
34d40a8f
remove redundant inline
0a4c4343
Merge remote-tracking branch 'origin/edgchen1/arm64_q4gemm' into edgc…
6de63398
save work - got sqnbitgemm tests and a cpu impl
01ac345d
clean up and add doc comments
d7bd7098
Merge remote-tracking branch 'origin/main' into edgchen1/arm64_q4gemm
e39dafb2
remove pragma once from cpp file
15f2b268
initial neon impl of MlasSQNBitGemmKernelNeon
a5e51ed3
add benchmark
d2682d0b
rename benchmark fn
2334da9e
Merge remote-tracking branch 'origin/main' into edgchen1/arm64_q4gemm…
f36e60e3
Fix build issue in reduceminmax.cc.
1761dffe
add block size 128 impl
fe754a60
add block size 128 tests
79934698
fix buffer type
b9c20035
save work
e90de87d
Merge remote-tracking branch 'origin/main' into edgchen1/arm64_q4gemm…
bdfc0529
add MLAS buffer size fn, remove old reference packing impl
448c4e52
fix some stuff
90f2ab5e
Merge remote-tracking branch 'origin/main' into edgchen1/arm64_q4gemm
6ed16f61
revert MlasTesterType change
93108932
remove q4gemm_neon impl
3d60b38b
add file comment
89708c5a
edgchen1
marked this pull request as ready for review 2 years ago
edgchen1
changed the title from "[WIP] MLAS AArch64 Q4Gemm kernels" to "MLAS AArch64 quantized int4 Gemm kernel" 2 years ago
compile warnings
07a4a574
fix another compiler warning
e061fa7f
update op attr helpers - add GetAttr returning T and clean up nodisca…
815292c9
change data members to size_t
fc54b7bb
remove unused MlasQ4BlkHasZeroPoint
608e0726
improve comment
fb756907
manual float conversion
c59e435c
revert q4gemm.h and q4gemm_avx512.cpp changes
c6241168
add block length 256 impl
c44c380b
address some PR comments
52a721a0
Merge remote-tracking branch 'origin/main' into edgchen1/arm64_q4gemm
9dee489a
update comment, fix type issue
0fce591a
add k=11008 n=11008 benchmarks
bac7bbfa
remove impl0_reference for MlasSQNBitGemmKernelNeon
f3c01627
fix unused variable
f9a31ed7
update headers, fix type
04cb1614
Merge remote-tracking branch 'origin/main' into edgchen1/arm64_q4gemm
56069ec5
address some comments
5212337c
remove b_data ptrs
97a8f58a
pass array by reference instead of returning std::array
2a6a9fcf
address comments
8458b754
yufenglee
approved these changes
on 2023-11-15
edgchen1
merged
0a4d76d9
into main 2 years ago
edgchen1
deleted the edgchen1/arm64_q4gemm branch 2 years ago