onnxruntime
[aarch64] Add Sbgemm kernel to accelerate fp32 tensor matmul with bfloat16
#17031

Merged

snnn merged 6 commits into microsoft:main from snadampal:sbgemm_aarch64

snadampal requested a review 2 years ago

snadampal force pushed from be947da3 to 2aef2f38 2 years ago

snadampal force pushed from 2aef2f38 to 9b51325a 2 years ago

yufenglee commented on 2023-09-26

chenfucn commented on 2023-09-26

yufenglee commented on 2023-09-26

snadampal force pushed from 9b51325a to cb12f7e1 2 years ago

snadampal force pushed from cb12f7e1 to 224c5b3a 2 years ago

snadampal force pushed from 224c5b3a to eb257ffa 2 years ago

snadampal force pushed from eb257ffa to 83a6f6ed 2 years ago

snadampal force pushed from 83a6f6ed to 2fffd446 2 years ago

chenfucn commented on 2023-10-12

yufenglee commented on 2023-10-12

chenfucn commented on 2023-10-12

snadampal force pushed from 2fffd446 to cef62df2 2 years ago

snadampal force pushed from 94a41a60 to 0552d341 2 years ago

snadampal force pushed from 0552d341 to 76360f30 2 years ago

snadampal force pushed from 76360f30 to e83242d9 2 years ago

snadampal force pushed from e83242d9 to 8371c763 2 years ago

snnn commented on 2023-11-29

snadampal force pushed from 8371c763 to 5bed8f08 2 years ago

snadampal force pushed from 5bed8f08 to 4c2c22b1 2 years ago

github-advanced-security commented on 2023-12-12

snadampal force pushed from 4c2c22b1 to 468f3094 2 years ago

snnn commented on 2023-12-19

skottmckay commented on 2023-12-19

snadampal force pushed from 2c04c37c to f45ef1da 2 years ago

yufenglee commented on 2024-01-19

yufenglee added release:1.17.0

yufenglee commented on 2024-01-19

chenfucn commented on 2024-01-19

define aarch64 bf16 hwcaps checks in cpuinfo and platform

5240363b

Add SBGEMM kernel to accelerate fp32 gemm with bfloat16

f8027c93

Integrate aarch64 bfloat16 sbgemm kernel into CPU EP MatMul operator

037052e7

add mlas unittests for sbgemm kernel

6376bfaa

add optimizer QDQ Transformer MatMul tests for sbgemm fastmath mode

9aca49a0

add ort execution provider math op matmul tests for sbgemm fastmath mode

d6d48c39

snadampal force pushed from f45ef1da to d6d48c39 2 years ago

chenfucn approved these changes on 2024-01-22

snnn approved these changes on 2024-01-22

snnn merged 77da2ef2 into main 2 years ago

snnn removed release:1.17.0

Reviewers

chenfucn

snnn

yufenglee

skottmckay

github-advanced-security

Assignees

No one assigned

Labels

None yet

Milestone

No milestone
