llama.cpp
Commit d6a50940: ggml-webgpu: Fix bug in FlashAttention support check (#22492)
Commit
11 days ago
ggml-webgpu: Fix bug in FlashAttention support check (#22492)

* Fix FlashAttention support check for devices that don't support subgroups
* Set path to none if kv_tile doesn't fit
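Per the commit message, the corrected check does two things: it no longer assumes subgroup support when picking a FlashAttention path, and it reports no usable path when the KV tile exceeds the device's workgroup memory. Below is a minimal C++ sketch of that logic under those assumptions; `fa_path`, `device_caps`, `choose_fa_path`, and the size limits are hypothetical names invented for illustration, not the actual ggml-webgpu code.

    // Minimal sketch with hypothetical names; not the actual ggml-webgpu code.
    #include <cstdint>

    enum class fa_path { none, subgroup_path, workgroup_path };

    struct device_caps {
        bool     has_subgroups;      // device exposes subgroup operations
        uint32_t max_workgroup_mem;  // bytes of workgroup (shared) memory
    };

    // Choose a FlashAttention path, or fa_path::none when the op cannot run.
    static fa_path choose_fa_path(const device_caps & caps, uint32_t kv_tile_bytes) {
        // Devices without subgroup support must fall back rather than pass
        // the subgroup-based check (the bug the commit fixes).
        fa_path path = caps.has_subgroups ? fa_path::subgroup_path
                                          : fa_path::workgroup_path;

        // If the kv_tile does not fit in workgroup memory, report the op
        // as unsupported instead of returning an invalid path.
        if (kv_tile_bytes > caps.max_workgroup_mem) {
            path = fa_path::none;
        }
        return path;
    }

    int main() {
        device_caps caps = { /*has_subgroups=*/false, /*max_workgroup_mem=*/16384 };
        // kv_tile larger than workgroup memory: expect fa_path::none.
        return choose_fa_path(caps, 32768) == fa_path::none ? 0 : 1;
    }

The key design point, as described in the commit, is that an oversized kv_tile downgrades the result all the way to "none" so the backend declines the op, rather than leaving a path selected that the device cannot execute.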
References
#22492 - ggml-webgpu: Fix bug in FlashAttention support check
Author
reeselevine
Parents
7b95ea5d