llama.cpp
4db56412 - opencl: add kernel to handle mat mul in attention to improve encoding speed (#17181)

Commit
25 days ago
opencl: add kernel to handle mat mul in attention to improve encoding speed (#17181)

* Add mul_mm_f16_f32_kq_kqv kernel
* Add ggml_cl_mul_mat_kq_kqv_adreno func
* fix whitespace
* remove unused variable
* remove redundant
* refactor and clean up
* remove trailing whitespace