onnxruntime
Fix flash attention for GQA (Phi4)
#23850
Merged

guschmue merged 1 commit into main from user/sushraja/fix_gqa_min
sushraja-msft: Fix GQA' (aa4ac2e0)
sushraja-msft requested a review from qjia7 308 days ago
sushraja-msft requested a review from guschmue 308 days ago
sushraja-msft marked this pull request as ready for review 308 days ago
sushraja-msft changed the title from "Fix GQA'" to "Fix flash attention for GQA" 308 days ago
sushraja-msft changed the title from "Fix flash attention for GQA" to "Fix flash attention for GQA (Phi4)" 308 days ago
guschmue approved these changes on 2025-02-28
guschmue merged 1be64f88 into main 307 days ago
guschmue deleted the user/sushraja/fix_gqa_min branch 307 days ago
guschmue added the ep:WebGPU label

