llama.cpp
opencl: enable the general fp mm for non-cont input and as a fallback for specialized kqv kernel for adreno
#18970
Merged

lhez
lhez opencl: add `copy_to_contiguous` and utilize mm kernels
6d0a567b
lhez opencl: only copy to cont for f32 and f16 tensors
b773905f
lhez opencl: use cont mm for fallback when dst is large
861c9815
lhez opencl: use nb local to copy-to-cont
ca8a5064
lhez opencl: use local offset as well
f04a782c
github-actions added ggml
github-actions added OpenCL
lhez marked this pull request as ready for review 7 days ago
lhez requested a review from max-krasnyansky 7 days ago
max-krasnyansky approved these changes on 2026-01-22
max-krasnyansky merged 9c96465f into master 6 days ago
