onnxruntime
Fix flash attention for GQA (Phi4)
#23850
Merged

guschmue merged 1 commit into main from user/sushraja/fix_gqa_min
sushraja-msft
sushraja-msft Fix GQA'
aa4ac2e0
sushraja-msft requested a review from qjia7 1 year ago
sushraja-msft requested a review from guschmue 1 year ago
sushraja-msft marked this pull request as ready for review 1 year ago
sushraja-msft changed the title from "Fix GQA'" to "Fix flash attention for GQA" 1 year ago
sushraja-msft changed the title from "Fix flash attention for GQA" to "Fix flash attention for GQA (Phi4)" 1 year ago
guschmue
guschmue approved these changes on 2025-02-28
guschmue merged 1be64f88 into main 1 year ago
guschmue deleted the user/sushraja/fix_gqa_min branch 1 year ago
guschmue added ep:WebGPU
