onnxruntime
bb3866cf - webgpu: support head_sink in flash attention (#27410)

Commit
29 days ago
This enables flash attention for gpt-oss.