onnxruntime
GQA unfused attention with FP32 QK accumulation (fixes #28195)
#28198
Merged

tianleiwu merged 8 commits into main from tlwu/unfused_gqa
tianleiwu GQA unfused attention with FP32 QK accumulation (fixes #28195)
99aa8b22
tianleiwu requested a review from copilot-pull-request-reviewer 22 days ago
tianleiwu requested a review from titaiwangms 22 days ago
copilot-pull-request-reviewer commented on 2026-04-23
tianleiwu review feedback
4fa20eff
titaiwangms commented on 2026-04-23
titaiwangms
tianleiwu fix: address review feedback - SafeInt AlignTo, y_bnsh H_v, ORT_ENFORCE
eeec5123
tianleiwu fix: address review summary feedback - SafeInt, logging, tests, v_hea…
b0f71ac0
tianleiwu Add C++ tests for GQA unfused attention with large head_size
778ee069
github-actions commented on 2026-04-23
titaiwangms commented on 2026-04-23
tianleiwu feedbacks
6354c525
tianleiwu
titaiwangms
titaiwangms commented on 2026-04-24
titaiwangms
tianleiwu address feedback
158a004d
titaiwangms dismissed these changes on 2026-04-24
tianleiwu fix build
0fcde9a9
tianleiwu dismissed their stale review via 0fcde9a9 21 days ago
tianleiwu requested a review from titaiwangms 21 days ago
titaiwangms approved these changes on 2026-04-25
tianleiwu merged 997c4798 into main 20 days ago
tianleiwu deleted the tlwu/unfused_gqa branch 20 days ago

Assignees
No one assigned