Implement Flash Attention 2 for webgpu EP #23576
guschmue dismissed these changes on 2025-02-05
Port over FA (ed066e1f)
Attempt FA2 (6b978cbd)
attempt to fix k-index (655c8b61)
This FA works (3df32ed2)
Add comments (71c8d59d)
Support all sg_size and restrict FA to prefill only. On ADL, WU drive… (05b0f250)
lint runner (362e969d)
sushraja-msft force-pushed from dc9d7527 to 362e969d 364 days ago
Remove components (4ab90c31)
sushraja-msft dismissed their stale review via 4ab90c31 364 days ago
remove half float notation from constants (635cd219)
Fix Attention bias. (f289c644)
exclude fa on devices without subgroups (a5fd8a67)
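The last commit above excludes Flash Attention on devices without subgroup support. Purely as an illustration, a minimal TypeScript sketch of such a capability check against the browser WebGPU API is shown below (it requires @webgpu/types; the "subgroups" feature name and the fallback behavior are assumptions for this example, not the EP's actual native implementation):

```typescript
// Illustrative sketch only: check whether the WebGPU adapter exposes subgroup
// support before choosing a Flash Attention code path. The "subgroups" feature
// name and this fallback logic are assumptions; the onnxruntime WebGPU EP
// performs its own check natively.
async function canUseFlashAttention(): Promise<boolean> {
  // navigator.gpu is undefined when WebGPU is not available at all.
  const adapter = await navigator.gpu?.requestAdapter();
  if (!adapter) {
    return false;
  }
  // GPUSupportedFeatures is set-like; has() reports optional adapter features.
  return adapter.features.has("subgroups");
}

canUseFlashAttention().then((ok) => {
  console.log(ok
    ? "subgroups available: FA2 path eligible"
    : "no subgroup support: fall back to the default attention kernel");
});
```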
guschmue approved these changes on 2025-02-07
guschmue merged 82840f63 into main 363 days ago
guschmue deleted the user/sushraja/flash_attention2 branch 363 days ago