[ARM CPU] Enable FP16 kernels for GQA op #23746
integrated mlas kernels to gqa
36785eff
fix build
922ad27f
fix build
9d8599b3
optimize hgemm packedb
d7d03c6e
use 4 accumulators
8f1b280f
loop parallelism for packing
d9fd7a25
1> added intra loop parallelism, 2> use const conditional branch pred…
eadeba9c
add todo
6eeb0f7a
added intra loop parallelism to rope
f35744c3
fix linting
c83765d1
fix build
9908e1eb
amarin16 approved these changes on 2025-02-20
fajin-corp deleted the fajin/gqa-integrate branch 311 days ago