llama.cpp
vulkan: support arbitrary KV dimension in flash attention
#16160
Merged

jeffbolznv requested a review from 0cc4m 18 days ago
github-actions added labels: Vulkan, ggml
0cc4m approved these changes on 2025-09-27
jeffbolznv: vulkan: support arbitrary KV dimension in flash attention (commit 88fea950)
jeffbolznv force-pushed from 48cbf213 to 88fea950 12 days ago
0cc4m merged e6d65fb0 into master 12 days ago
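The PR page itself carries no implementation details, but the title implies that the Vulkan flash attention shader previously required the KV sequence length to be a multiple of its tile/workgroup size, and now handles arbitrary lengths. A common way to do that is to pad the KV dimension up to the tile multiple and mask out-of-range positions with negative infinity before the softmax, so they contribute zero weight. The sketch below is a minimal, hypothetical Python illustration of that masking idea, not the actual GLSL shader code from this PR; all function and variable names here are invented for illustration.

```python
import math

def attention_row(q, K, V, kv_len, tile=4):
    # Illustrative sketch: process K/V in fixed-size tiles, padding the
    # KV dimension up to a tile multiple. Positions past kv_len get a
    # score of -inf, so exp(-inf) = 0 and they drop out of the softmax.
    padded = ((kv_len + tile - 1) // tile) * tile
    scores = []
    for j in range(padded):
        if j < kv_len:
            scores.append(sum(qi * ki for qi, ki in zip(q, K[j])))
        else:
            scores.append(-math.inf)  # mask padded (out-of-range) slots
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    z = sum(weights)
    d = len(V[0])
    # Padded weights are exactly zero, so summing only over kv_len is safe.
    return [sum(weights[j] * V[j][k] for j in range(kv_len)) / z
            for k in range(d)]
```

Because the masked positions receive zero weight, the result matches an unpadded computation over exactly `kv_len` positions, for any `kv_len`, multiple of the tile size or not.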
