llama.cpp
96a712ca - Porting the improved K-Quant CUDA kernels to OpenCL (#1966)

Commit
1 year ago
Porting the improved K-Quant CUDA kernels to OpenCL (#1966)

* Added broken new q4k quant
* xx + ib0
* Fix q2_k fast kernel
* Use preprocessor for QK_K
* Add q6_k fast matmul kernel
* ported q3k speedup successfully
* ported q2k and q5k speedups
* remove old dot kernels and template
* fixed global const struct types
* fixing address spaces
* fixed string too long CI issue

---------

Co-authored-by: 0cc4m <picard12@live.de>
  • File
    ggml-opencl.cpp