[aarch64] Add Sbgemm kernel to accelerate fp32 tensor matmul with bfloat16 #17031
snadampal force pushed from be947da3 to 2aef2f38 2 years ago
snadampal force pushed from 2aef2f38 to 9b51325a 2 years ago
snadampal force pushed from 9b51325a to cb12f7e1 2 years ago
snadampal force pushed from cb12f7e1 to 224c5b3a 2 years ago
snadampal force pushed from 224c5b3a to eb257ffa 2 years ago
snadampal force pushed from eb257ffa to 83a6f6ed 2 years ago
snadampal force pushed from 83a6f6ed to 2fffd446 2 years ago
snadampal force pushed from 2fffd446 to cef62df2 2 years ago
snadampal force pushed from 94a41a60 to 0552d341 2 years ago
snadampal force pushed from 0552d341 to 76360f30 2 years ago
snadampal force pushed from 76360f30 to e83242d9 2 years ago
snadampal force pushed from e83242d9 to 8371c763 2 years ago
snnn commented on 2023-11-29
snadampal force pushed from 8371c763 to 5bed8f08 2 years ago
snadampal force pushed from 5bed8f08 to 4c2c22b1 2 years ago
snadampal force pushed from 4c2c22b1 to 468f3094 2 years ago
snnn commented on 2023-12-19
snadampal force pushed from 2c04c37c to f45ef1da 2 years ago
define aarch64 bf16 hwcaps checks in cpuinfo and platform
5240363b
Add SBGEMM kernel to accelerate fp32 gemm with bfloat16
f8027c93
Integrate aarch64 bfloat16 sbgemm kernel into CPU EP MatMul operator
037052e7
add mlas unittests for sbgemm kernel
6376bfaa
add optimizer QDQ Transformer MatMul tests for sbgemm fastmath mode
9aca49a0
add ort execution provider math op matmul tests for sbgemm fastmath mode
d6d48c39
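For readers unfamiliar with the "fastmath" idea named in the commits above, here is a rough reference sketch of what an sbgemm-style matmul computes: fp32 inputs are rounded to bfloat16, multiplied, and accumulated at higher precision. This is not the MLAS kernel from this PR. The function names (`fp32_to_bf16_bits`, `to_bf16`, `sbgemm_reference`) are hypothetical, round-to-nearest-even is an assumption about the conversion, and the accumulation here uses Python floats rather than the fp32 accumulation an aarch64 kernel would use.

```python
import struct


def fp32_to_bf16_bits(x: float) -> int:
    """Return the 16-bit bfloat16 pattern for x (round-to-nearest-even, assumed)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # bfloat16 keeps the top 16 bits of an fp32 value; add a bias so that
    # dropping the low 16 bits rounds to nearest, with ties going to even.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + rounding_bias) >> 16) & 0xFFFF


def bf16_bits_to_fp32(b: int) -> float:
    """Widen a bfloat16 bit pattern back to a float by zero-filling the low bits."""
    return struct.unpack("<f", struct.pack("<I", (b & 0xFFFF) << 16))[0]


def to_bf16(x: float) -> float:
    """Round x to the nearest bfloat16-representable value."""
    return bf16_bits_to_fp32(fp32_to_bf16_bits(x))


def sbgemm_reference(a, b):
    """Naive C = A @ B with both inputs rounded to bfloat16 first.

    a is an M x K list of lists, b is a K x N list of lists.
    Accumulation uses Python floats (an assumption made for simplicity).
    """
    m, k, n = len(a), len(b), len(b[0])
    a16 = [[to_bf16(v) for v in row] for row in a]
    b16 = [[to_bf16(v) for v in row] for row in b]
    c = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            acc = 0.0
            for p in range(k):
                acc += a16[i][p] * b16[p][j]
            c[i][j] = acc
    return c
```

bfloat16 keeps only 7 explicit mantissa bits, so inputs such as `1.0 + 2**-9` collapse to `1.0` before the multiply. That precision loss is the trade-off a fastmath mode accepts in exchange for throughput.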
snadampal force pushed from f45ef1da to d6d48c39 2 years ago
chenfucn approved these changes on 2024-01-22
snnn approved these changes on 2024-01-22
snnn merged 77da2ef2 into main 2 years ago
snnn removed release:1.17.0