onnxruntime
[ARM CPU] Enable FP16 kernels for GQA op
#23746
Merged

Commits
  • integrated mlas kernels to gqa
    fajin-corp committed 318 days ago
  • fix build
    fajin-corp committed 318 days ago
  • fix build
    fajin-corp committed 318 days ago
  • optimize hgemm packedb
    fajin-corp committed 318 days ago
  • use 4 accumulators
    fajin-corp committed 316 days ago
  • loop parallelism for packing
    fajin-corp committed 315 days ago
  • 1> added intra loop parallelism, 2> use const conditional branch predicates, 3> cover lda/ldb/ldc in UT
    fajin-corp committed 315 days ago
  • add todo
    fajin-corp committed 315 days ago
  • added intra loop parallelism to rope
    fajin-corp committed 313 days ago
  • fix linting
    fajin-corp committed 313 days ago
  • fix build
    fajin-corp committed 313 days ago