onnxruntime
[webgpu] Optimize FlashAttention for prefill
#25395
Merged

daijh [webgpu] Optimize FlashAttention for prefill
3063b2e2
qjia7 commented on 2025-07-15
qjia7 commented on 2025-07-15
qjia7 dismissed these changes on 2025-07-15
daijh Explicitly set `is_unidirectional_` to true for GQA
7bd19af3
daijh dismissed their stale review via 7bd19af3 191 days ago
fs-eire approved these changes on 2025-07-22
guschmue added the ep:WebGPU label
guschmue merged 2bd00ec4 into main 177 days ago
daijh deleted the optimize-flash-attention-for-prefill branch 177 days ago

