onnxruntime
[webgpu] Flash attention for generation
#23808
Merged

guschmue merged 18 commits into main from attention_generate_fa
qjia7
github-actions commented on 2025-02-25
guschmue added the ep:WebGPU label
qjia7 [webgpu] Add flash decoding
f0424fd1
qjia7 force pushed from 6f6d6d15 to f0424fd1 317 days ago
qjia7 fix CI errors
48affd3b
qjia7 changed the title from "[WIP] Flash attention for generation" to "[webgpu] Flash attention for generation" 316 days ago
qjia7 limit it to static kv cache
96aaa897
qjia7 requested a review from sushraja-msft 316 days ago
qjia7 requested a review from guschmue 316 days ago
qjia7 marked this pull request as ready for review 316 days ago
qjia7 Merge branch 'main' into attention_generate_fa_good
a97ad569
qjia7 remove the limitations
40aa7ada
qjia7 Merge branch 'main' into attention_generate_fa_good
99df2e9d
qjia7 Use 1D dispatch group size
e9c18db9
qjia7 marked this pull request as draft 308 days ago
qjia7 add annotations
c96e925d
qjia7 marked this pull request as ready for review 308 days ago
qjia7 Use similar var name with matmul
0fb5c2f0
guschmue
qjia7 Merge branch 'main' into attention_generate_fa_good
0d3a7381
qjia7
qjia7 update cache hints
7cbed5fa
sushraja-msft requested changes on 2025-03-26
qjia7 address comments
2526992b
qjia7 commented on 2025-03-27
qjia7 requested a review from sushraja-msft 299 days ago
sushraja-msft commented on 2025-03-31
qjia7 address comments
922ca1b9
qjia7 Merge branch 'main' into attention_generate_fa_good
51595805
qjia7 Rename XXXSplitK to XXXSplitVxScore
2f4a1f79
qjia7 Modify the comments
acbf5442
sushraja-msft approved these changes on 2025-04-02
sushraja-msft dismissed these changes on 2025-04-04
guschmue dismissed these changes on 2025-04-04
guschmue
qjia7 Merge branch 'main' into attention_generate_fa_good
639a2abd
qjia7 address comments
191cf414
qjia7 dismissed their stale review via 191cf414 289 days ago
qjia7 dismissed their stale review via 191cf414 289 days ago
qjia7 commented on 2025-04-07
qjia7 requested a review from sushraja-msft 289 days ago
qjia7 requested a review from guschmue 289 days ago
sushanthr approved these changes on 2025-04-08
guschmue approved these changes on 2025-04-08
guschmue removed review request from sushraja-msft 288 days ago
guschmue merged 18f91e55 into main 288 days ago
guschmue deleted the attention_generate_fa branch 288 days ago
