onnxruntime
56c984ff - [webgpu] Simplify the signature of CanApplyFlashAttention (#26926)

Commit
134 days ago
This pull request simplifies how present key/value tensors are handled in the WebGPU Flash Attention implementation. The main change moves responsibility for creating the internal present key/value tensors from the caller into the `ApplyFlashAttention` function itself, which reduces code duplication and makes the API easier to use. In addition, `CanApplyFlashAttention` is simplified by removing its unnecessary checks on the present key/value tensors.
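The shape of this refactoring can be illustrated with a minimal, self-contained sketch. It is not the onnxruntime code: `Tensor`, `AttentionParams`, `CanApplyFlashAttentionSketch`, and `ApplyFlashAttentionSketch` are hypothetical stand-ins, and the eligibility condition is a placeholder. It only shows the pattern the message describes, where the eligibility check ignores present key/value tensors and the apply function allocates internal ones when the caller supplies none.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical stand-in for a tensor; not the onnxruntime Tensor class.
struct Tensor {
  std::vector<float> data;
};

// Hypothetical parameter bundle; the real kernel uses many more fields.
struct AttentionParams {
  std::size_t total_sequence_length;
  std::size_t head_size;
};

// Eligibility check after the change: it depends only on the attention
// parameters and no longer takes present key/value tensors.
bool CanApplyFlashAttentionSketch(const AttentionParams& p) {
  return p.head_size % 4 == 0;  // placeholder condition (assumption)
}

// Apply step after the change: if the caller passes no present key or
// value output, an internal buffer is created here instead of at the
// call site. Returns true when any internal buffer was created.
bool ApplyFlashAttentionSketch(const AttentionParams& p,
                               Tensor* present_key,
                               Tensor* present_value) {
  const std::size_t n = p.total_sequence_length * p.head_size;
  std::optional<Tensor> internal_key;
  std::optional<Tensor> internal_value;
  bool created_internal = false;

  if (present_key == nullptr) {
    internal_key = Tensor{std::vector<float>(n)};
    present_key = &*internal_key;
    created_internal = true;
  }
  if (present_value == nullptr) {
    internal_value = Tensor{std::vector<float>(n)};
    present_value = &*internal_value;
    created_internal = true;
  }

  // The attention computation would read and write through
  // present_key / present_value here; it is omitted in this sketch.
  (void)present_key;
  (void)present_value;
  return created_internal;
}
```

With this arrangement a caller that has no present outputs passes null pointers and no longer needs its own allocation code, which is the duplication the message says was removed.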