llama.cpp
bcf5bda6 - Vulkan MMQ Integer Dot Refactor and K-Quant support (#16536)

Commit

253 days ago

Vulkan MMQ Integer Dot Refactor and K-Quant support (#16536)

* vulkan: add mmq q2_k integer dot support
* Refactor mmq caching
* Reduce mmq register use
* Load 4 quant blocks into shared memory in one step
* Pack q2_k blocks into caches of 32
* Use 32-bit accumulators for integer dot matmul
* Add q4_k mmq
* Add q3_k mmq
* Add q5_k mmq
* Add q6_k mmq
* Add mxfp4 mmq, enable MMQ MUL_MAT_ID
* Fix mmv dm loads
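For readers unfamiliar with the "32-bit accumulators for integer dot matmul" item, the following is a minimal CPU-side sketch of the general idea, not the shader code from this commit. It assumes a simplified block of 32 signed 8-bit quants packed four per 32-bit word, with one float scale per block. The names `pack_4x8`, `dot_packed_4x8`, and `block_dot` are hypothetical and do not appear in the commit.

```cpp
#include <cstdint>

// Hypothetical helper: pack four signed 8-bit values into one 32-bit word,
// lowest lane in the lowest byte.
static uint32_t pack_4x8(int8_t x0, int8_t x1, int8_t x2, int8_t x3) {
    return  static_cast<uint32_t>(static_cast<uint8_t>(x0))
         | (static_cast<uint32_t>(static_cast<uint8_t>(x1)) << 8)
         | (static_cast<uint32_t>(static_cast<uint8_t>(x2)) << 16)
         | (static_cast<uint32_t>(static_cast<uint8_t>(x3)) << 24);
}

// Hypothetical helper: emulate a packed 4x8-bit signed integer dot product,
// widening every lane product to 32 bits before summing.
static int32_t dot_packed_4x8(uint32_t a, uint32_t b) {
    int32_t sum = 0;
    for (int lane = 0; lane < 4; ++lane) {
        const int8_t ai = static_cast<int8_t>(static_cast<uint8_t>(a >> (8 * lane)));
        const int8_t bi = static_cast<int8_t>(static_cast<uint8_t>(b >> (8 * lane)));
        sum += static_cast<int32_t>(ai) * static_cast<int32_t>(bi);
    }
    return sum;
}

// Assumed simplified block layout: n_words packed words of quants plus one
// float scale per operand. The integer sum is kept in a 32-bit accumulator
// and the scales are applied once at the end.
static float block_dot(const uint32_t* qa, float da,
                       const uint32_t* qb, float db, int n_words) {
    int32_t acc = 0;
    for (int i = 0; i < n_words; ++i) {
        acc += dot_packed_4x8(qa[i], qb[i]);
    }
    return da * db * static_cast<float>(acc);
}
```

With 32 quants per block, the worst case is 32 products of (-128) * (-128), which sums to 524288. That does not fit in 16 bits, which is one plausible reason to accumulate in 32 bits before converting to float.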

References

#16536 - Vulkan MMQ Integer Dot Refactor and K-Quant support

Author

0cc4m

Parents

3eb2be1c
