llama.cpp 7f459c98 - vulkan: use fewer FA rows for small cache runs (#18280)
Commit
4 days ago
vulkan: use fewer FA rows for small cache runs (#18280)
References
#18280 - Vulkan: Tune Flash Attention for MoE on AMD GPUs
Author
0cc4m
Parents
cf2ffc02