onnxruntime
bb3866cf - webgpu: support head_sink in flash attention (#27410)

Commit
22 hours ago
webgpu: support head_sink in flash attention (#27410)

This enables flash attention for gpt-oss.