onnxruntime
[webgpu] Apply Flash Attention if sliding window exceeds KV cache length
#25594
Merged

daijh Apply flash attention if sliding window exceeds KV cache length
7a827847
daijh Fix typo
ade51b09
daijh Check sequence length
5f00414c
qjia7 commented on 2025-07-30
guschmue added the ep:WebGPU label
qjia7 commented on 2025-07-31
daijh Resolve comments
52ce1a14
qjia7 dismissed these changes on 2025-07-31
daijh dismissed their stale review via 4153d115 150 days ago
daijh Minor update comment
4153d115
guschmue approved these changes on 2025-07-31
guschmue merged 7cc93cf4 into main 149 days ago
daijh deleted the supports-sliding-window-for-flash-attention branch 148 days ago

