llama.cpp
7f459c98 - vulkan: use fewer FA rows for small cache runs (#18280)

Commit

4 days ago

vulkan: use fewer FA rows for small cache runs (#18280)

References

#18280 - Vulkan: Tune Flash Attention for MoE on AMD GPUs

Author

0cc4m

