[webgpu] support smooth softmax for non-FA GQA implementation #25285
[webgpu] support smooth softmax for non-FA implementation
8d3f73ea
Add stub for smooth softmax in FlashAttention
55f364e2
fs-eire
marked this pull request as draft 314 days ago
fs-eire
changed the title from "[webgpu] support smooth softmax for non-FA GQA implementation" to "[WIP][webgpu] support smooth softmax for non-FA GQA implementation" 314 days ago
fs-eire
force pushed from f85e572b to dd8d91a4 314 days ago
fs-eire
force pushed from dd8d91a4 to a63fc649 313 days ago
fs-eire
force pushed from a63fc649 to 2fed3b1e 313 days ago
fs-eire
force pushed from 2fed3b1e to c201d496 313 days ago
fs-eire
force pushed from c201d496 to 5845d0ed 312 days ago
fs-eire
changed the title from "[WIP][webgpu] support smooth softmax for non-FA GQA implementation" to "[webgpu] support smooth softmax for non-FA GQA implementation" 312 days ago
fs-eire
marked this pull request as ready for review 312 days ago
Add implementation of head sink and smooth softmax
3a7b54ff
fs-eire
force pushed from 5845d0ed to 3a7b54ff 312 days ago
guschmue
requested changes on 2025-07-07
resolve comments
de25b3ea
Merge remote-tracking branch 'origin/main' into fs-eire/webgpu-smooth…
43a24e80
guschmue
approved these changes on 2025-07-07
fs-eire
merged 6d28e2d2 into main 310 days ago
fs-eire
deleted the fs-eire/webgpu-smooth-softmax branch 310 days ago
Assignees
No one assigned