onnxruntime
21325309 - [webgpu] Unify the present_sequence_length in flash attention (#25945)

Commit
99 days ago
[webgpu] Unify the present_sequence_length in flash attention (#25945)

### Description

This PR unifies the present_sequence_length in flash attention and removes the dependency on total_sequence_length. This is in preparation for graph capture support (see #25868).