onnxruntime
bb3866cf
- webgpu: support head_sink in flash attention (#27410)
Commit
22 hours ago
webgpu: support head_sink in flash attention (#27410)

This enables flash attention for gpt-oss
References
#27410 - webgpu: support head_sink in flash attention
Author
guschmue
Parents
2145c8c0