opencl: add flattened q6_K mv (#19054)

Commit

182 days ago

opencl: add flattened q6_K mv (#19054)

* opencl: flatten `q6_K` and add `kernel_mul_mv_q6_K_f32_flat`
* opencl: clean up
* opencl: refactor q6_K mv - put loop body in `block_q_6_K_dot_y_flat`
* opencl: tweak the workgroup size a bit
* opencl: output 4 values per subgroup for `kernel_mul_mv_q6_K_f32_flat`
* opencl: proper alignment for q6_K
* opencl: boundary handling for flattened q6_K mv
* opencl: rename q6_K mv kernel file
* opencl: put flattened q6_K mv in its own file
* opencl: use lower k in file name
* opencl: use K in variable names
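For readers unfamiliar with the term, "flattening" a quantized block type usually means splitting an array of block structs into separate contiguous arrays, one per field, so that neighbouring work-items read neighbouring memory. The host-side C sketch below illustrates that idea only. The struct layout is assumed to follow ggml's `block_q6_K` (with the fp16 scale held as raw bits), and `block_q6_K_sketch` and `flatten_q6_K_sketch` are hypothetical names, not the code added by this commit.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define QK_K 256 /* assumed super-block size */

/* Assumed layout, modelled on ggml's block_q6_K (hypothetical name). */
typedef struct {
    uint8_t  ql[QK_K / 2];      /* lower 4 bits of each 6-bit quant */
    uint8_t  qh[QK_K / 4];      /* upper 2 bits of each 6-bit quant */
    int8_t   scales[QK_K / 16]; /* 8-bit sub-block scales */
    uint16_t d;                 /* fp16 super-block scale, raw bits */
} block_q6_K_sketch;

/* Hypothetical helper: copy each field of n blocks into its own
 * contiguous array (array-of-structs to struct-of-arrays). */
static void flatten_q6_K_sketch(const block_q6_K_sketch *blocks, size_t n,
                                uint8_t *ql, uint8_t *qh,
                                int8_t *scales, uint16_t *d) {
    for (size_t i = 0; i < n; ++i) {
        memcpy(ql     + i * (QK_K / 2),  blocks[i].ql,     QK_K / 2);
        memcpy(qh     + i * (QK_K / 4),  blocks[i].qh,     QK_K / 4);
        memcpy(scales + i * (QK_K / 16), blocks[i].scales, QK_K / 16);
        d[i] = blocks[i].d;
    }
}
```

With this layout, block `i` keeps its low bits at `ql + i * 128`, its high bits at `qh + i * 64`, its scales at `scales + i * 16`, and its super-block scale at `d[i]`. The actual buffer arrangement and alignment used by the OpenCL backend may differ.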

References

#19054 - opencl: add flattened q6_K mv

Author

lhez

Parents

b0311c16

llama.cpp 94eeb596 - opencl: add flattened q6_K mv (#19054)