llama.cpp
opencl: add kernel to handle mat mul in attention to improve encoding speed
#17181
Merged
Commits (shaofeiqi):
- 9e5c5960 Add mul_mm_f16_f32_kq_kqv kernel
- 24f32df4 Add ggml_cl_mul_mat_kq_kqv_adreno func
- dada5171 fix whitespace
- 0fc4b8bd remove unused variable
- 301662b2 remove redundant
- 41bf54f8 refactor and clean up
- b3ee2ab0 remove trailing whitespace

Timeline:
- shaofeiqi requested reviews from lhez and max-krasnyansky (31 days ago)
- github-actions added labels: ggml, OpenCL
- max-krasnyansky approved these changes on 2025-11-16
- max-krasnyansky merged 4db56412 into master (27 days ago)