vulkan: optimize flash attention split_k_reduce #14554
vulkan: allow FA split_k with smaller KV values
314e0e61
vulkan: spread split_k_reduce work across more threads
8f24cd9a
0cc4m approved these changes on 2025-07-08
0cc4m merged 6efcd659 into master 191 days ago