llama.cpp
bcf5bda6 - Vulkan MMQ Integer Dot Refactor and K-Quant support (#16536)

Commit
132 days ago
Vulkan MMQ Integer Dot Refactor and K-Quant support (#16536)

* vulkan: add mmq q2_k integer dot support
* Refactor mmq caching
* Reduce mmq register use
* Load 4 quant blocks into shared memory in one step
* Pack q2_k blocks into caches of 32
* Use 32-bit accumulators for integer dot matmul
* Add q4_k mmq
* Add q3_k mmq
* Add q5_k mmq
* Add q6_k mmq
* Add mxfp4 mmq, enable MMQ MUL_MAT_ID
* Fix mmv dm loads