onnxruntime
[webgpu] Optimize flash decoding by merging QKT and SplitVx shader
#25929
Open

xiaofeihan1
xiaofeihan1 implement
0a3da97d

Reviewers
No reviews
Assignees
No one assigned