onnxruntime
Fix CUDA Attention dispatch: skip MEA when head_size != v_head_size in GQA
#28358

Merged

justinchuby merged 1 commit into main from fix-attention-head-size-mismatch

Fix CUDA Attention dispatch: skip MEA when head_size != v_head_size i…

32b357d2

justinchuby requested a review from copilot-pull-request-reviewer 6 days ago

titaiwangms requested a review from titaiwangms 6 days ago

tianleiwu approved these changes on 2026-05-05

justinchuby merged 1f257837 into main 6 days ago

justinchuby deleted the fix-attention-head-size-mismatch branch 6 days ago

Reviewers

tianleiwu

titaiwangms
