llama.cpp
c37052ab - vulkan: mul_mat_id coopmat2 optimizations (#15546)

Commit
7 days ago
vulkan: mul_mat_id coopmat2 optimizations (#15546)

* vulkan: mul_mat_id coopmat2 optimizations

Add a path for when the tile fits in BN/2, similar to what we have for mul_mat.

Only call fetch_scales/store_scales once per QUANT_K block, and once at the beginning in case start_k is not aligned.

* Also add a path for BN/4 - worth a couple more percent
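
The two ideas in the commit message can be illustrated with a small host-side C++ sketch. This is not the shader code from the commit: `pick_tile_n` and `count_scale_fetches` are hypothetical names introduced here, and the sketch assumes the k loop advances by a fixed step that divides QUANT_K, with start_k a multiple of that step.

```cpp
#include <cstdint>

// Hypothetical helper: choose the narrowest tile width among BN, BN/2 and BN/4
// that still covers n_cols columns, mirroring the "tile fits in BN/2" and
// "BN/4" paths named in the commit message.
static uint32_t pick_tile_n(uint32_t BN, uint32_t n_cols) {
    if (n_cols <= BN / 4) return BN / 4;
    if (n_cols <= BN / 2) return BN / 2;
    return BN;
}

// Hypothetical helper: count how many times scales would be fetched while
// walking k over [start_k, end_k) in increments of `step`, when scales are
// fetched once per QUANT_K block rather than on every iteration.
static uint32_t count_scale_fetches(uint32_t start_k, uint32_t end_k,
                                    uint32_t step, uint32_t QUANT_K) {
    uint32_t fetches = 0;
    // One fetch up front if the walk starts in the middle of a block.
    if (start_k < end_k && start_k % QUANT_K != 0) {
        fetches++;
    }
    for (uint32_t k = start_k; k < end_k; k += step) {
        // Fetch again only when k crosses into a new QUANT_K block.
        if (k % QUANT_K == 0) {
            fetches++;
        }
    }
    return fetches;
}
```

With QUANT_K = 256 and a step of 32, walking k over [0, 512) takes 16 iterations but only 2 fetches under this scheme. Starting at the unaligned start_k = 64 also gives 2: one at the beginning and one at k = 256.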