onnxruntime
[ARM CPU] Enable FP16 kernels for GQA op
#23746
Merged

fajin-corp merged 11 commits into main from fajin/gqa-integrate
fajin-corp
fajin-corp integrated mlas kernels to gqa
36785eff
fajin-corp fix build
922ad27f
fajin-corp fix build
9d8599b3
fajin-corp optimize hgemm packedb
d7d03c6e
fajin-corp use 4 accumulators
8f1b280f
fajin-corp loop parallelism for packing
d9fd7a25
fajin-corp 1> added intra loop parallelism, 2> use const conditional branch pred…
eadeba9c
fajin-corp add todo
6eeb0f7a
fajin-corp added intra loop parallelism to rope
f35744c3
fajin-corp requested a review 313 days ago
github-actions commented on 2025-02-19
fajin-corp fix linting
c83765d1
fajin-corp fix build
9908e1eb
amarin16 approved these changes on 2025-02-20
fajin-corp merged 2d33ee91 into main 311 days ago
fajin-corp deleted the fajin/gqa-integrate branch 311 days ago
