llama.cpp
OpenCL: MoE MXFP4 kernel optimizations
#16037
Merged

lhez merged 15 commits into ggml-org:master from CodeLinaro:moe-mxfp4-opt-1
shawngu-quic
shawngu-quic Q4_0 fix and MXFP4 optimizations
9710ef41
shawngu-quic SOA support for non-MoE mxfp4 gemm
92800104
shawngu-quic clean up
29b73d4b
shawngu-quic Keep GGML_OPENCL_SOA_Q default
76d3e84e
lhez opencl: clean up
374c3b7e
lhez opencl: fix `kernel_restore_block_mxfp4`
464ebeb1
lhez opencl: fix non adreno GPU
36676c0a
lhez opencl: recover broadcast semantic for `mul_mv_mxfp4_f32_flat`
7184682a
lhez opencl: use broadcast semantic for `mul_mv_id_mxfp4_f32_flat`
7a15e0e7
lhez opencl: fix ndst for `mul_mv_mxfp4_f32_flat` for adreno
7aa67ce8
lhez opencl: use original mxfp4 mv for structs
fe12b20c
lhez opencl: fix whitespace
b7423294
lhez opencl: fix whitespace
a69591a7
lhez opencl: fix size calculation when creating image1d_buffer_t
dbe0c3bf
github-actions added ggml
github-actions added OpenCL
lhez requested a review from lhez 176 days ago
lhez requested a review from max-krasnyansky 176 days ago
lhez opencl: fix unused variable
7eb7e0d3
lhez approved these changes on 2025-09-18
lhez merged 3edd87cd into master 174 days ago
