llama.cpp
e6d65fb0 - vulkan: support arbitrary KV dimension in flash attention (#16160)

Commit · 12 days ago
vulkan: support arbitrary KV dimension in flash attention (#16160)

The "Clamp" spec constant is already set based on whether KV is a multiple of Bc, so use it to control whether bounds checking is performed. Add bounds checking to the scalar and coopmat1 paths. Coopmat2 didn't need any changes (the K/V tensors are already optionally clamped, so nothing else had to change).
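The idea is that the shader already carries a "Clamp" specialization constant that is only enabled when KV is not a multiple of the tile width Bc, so the same constant can gate the new bounds checks and they cost nothing on the aligned fast path. Below is a minimal sketch of that pattern in plain C++ rather than the actual GLSL: a compile-time `Clamp` flag stands in for the spec constant, and `tile_row_max`, `Bc = 32`, and the host-side dispatch are all hypothetical names and values for illustration, not code from the llama.cpp shaders.

```cpp
// Sketch: a compile-time "Clamp" flag (analogous to a Vulkan specialization
// constant) gates per-column bounds checks in a K/V tile loop. When KV is a
// multiple of the tile width Bc, the Clamp=false instantiation is used and
// the checks are compiled away.
#include <algorithm>
#include <cstdint>
#include <cstdio>
#include <vector>

constexpr uint32_t Bc = 32;  // hypothetical tile width (columns of K/V per tile)

template <bool Clamp>
float tile_row_max(const std::vector<float>& scores, uint32_t kv_len, uint32_t tile_start) {
    float m = -1e30f;
    for (uint32_t c = 0; c < Bc; ++c) {
        uint32_t col = tile_start + c;
        // The bounds check only exists in the Clamp=true instantiation.
        if (Clamp && col >= kv_len) {
            break;  // past the end of K/V for a partial last tile
        }
        m = std::max(m, scores[col]);
    }
    return m;
}

int main() {
    uint32_t kv_len = 100;  // not a multiple of Bc, so the clamped path is needed
    std::vector<float> scores(kv_len, 1.0f);
    scores[97] = 3.5f;
    // On the real backend the host picks the pipeline; here we pick the
    // instantiation the same way: clamp only when KV is not a multiple of Bc.
    float m = (kv_len % Bc == 0) ? tile_row_max<false>(scores, kv_len, 96)
                                 : tile_row_max<true>(scores, kv_len, 96);
    std::printf("row max = %f\n", m);
    return 0;
}
```

The same trick is why coopmat2 needed no changes: its K/V loads were already optionally clamped, so only the scalar and coopmat1 paths had to gain the Clamp-guarded checks.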