llama.cpp
33f890e5
Commit
49 days ago
vulkan: support flash attention GQA/split_k with small batches (#18938)
References
#18938 - vulkan: support flash attention GQA/split_k with small batches
Author
jeffbolznv
Parents
067b8d7a
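The commit title bundles two ideas: GQA (grouped-query attention, where several query heads share one key/value head) and split_k, which partitions the key/value sequence across workgroups so that small batches still expose enough parallelism to fill the GPU. The actual change lives in llama.cpp's Vulkan shaders; the C++ below is only a minimal host-side sketch of the generic split_k combine step for flash attention, with all names (SplitPartial, combine_splits) invented for illustration.

```cpp
// Conceptual sketch, NOT the llama.cpp Vulkan shader code: each split
// processes one slice of the KV sequence and emits a partial result
// (running max m, exp-sum l, exp-weighted accumulator acc) for a query
// row; a final pass merges the partials in a numerically stable way.

#include <algorithm>
#include <cmath>
#include <vector>

// Hypothetical per-split partial output for one query row.
struct SplitPartial {
    float m;                 // max attention logit seen in this split
    float l;                 // sum of exp(logit - m) over this split
    std::vector<float> acc;  // exp-weighted sum of V rows (length = head dim)
};

// Merge the partials from all splits into the final attention output.
std::vector<float> combine_splits(const std::vector<SplitPartial>& parts) {
    const size_t d = parts[0].acc.size();

    // Global max over all splits; each partial is rescaled to it.
    float m_max = parts[0].m;
    for (const auto& p : parts) m_max = std::max(m_max, p.m);

    float l_total = 0.0f;
    std::vector<float> out(d, 0.0f);
    for (const auto& p : parts) {
        const float scale = std::exp(p.m - m_max);  // rebase onto global max
        l_total += p.l * scale;
        for (size_t i = 0; i < d; ++i) out[i] += p.acc[i] * scale;
    }

    // Normalize by the combined softmax denominator.
    for (size_t i = 0; i < d; ++i) out[i] /= l_total;
    return out;
}
```

The rescale by exp(m - m_max) is what makes the merge safe: each split tracked its own local maximum, and rebasing every partial exp-sum and accumulator onto the global maximum before summing reproduces exactly the single-pass softmax result without overflow.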