ggml webgpu: initial flashattention implementation #18610
FlashAttention (#13)
36b5e5cc
Update to account for default kv cache padding
b6c86244
formatting shader
e5bf2d5f
reeselevine force pushed from ab90db0f to e5bf2d5f 34 days ago
Add workflow for ggml-ci webgpu
e01f7850
Try passing absolute path to dawn in ggml-ci
e725774e
Avoid error on device destruction, add todos for proper cleanup
1eb1588c
Fix unused warning
86c0da6c
Forgot one parameter unused
286596a8
Move some flashattn computation to f32 for correctness
d8d9a1e4
ggerganov approved these changes on 2026-01-08
CISC approved these changes on 2026-01-08