llama.cpp
438a8392 - vulkan: add specific MMV kernels for IQ2 and IQ3 quants + optimizations (#11595)

Commit
319 days ago
vulkan: add specific MMV kernels for IQ2 and IQ3 quants + optimizations (#11595)

* vulkan: implement specialized MMV kernels for IQ2 quantizations
* vulkan: add MMV kernels for IQ3 quants
* vulkan: Increase MMV batch size and unroll IQ LUT setup
* vulkan: fix init_iq_shmem for WG sizes larger than tables
* vulkan: common batch size for all I-quants