OpenCL: MoE MXFP4 kernel optimizations #16037
Q4_0 fix and MXFP4 optimizations
9710ef41
SOA support for non-MoE mxfp4 gemm
92800104
clean up
29b73d4b
Keep GGML_OPENCL_SOA_Q default
76d3e84e
opencl: clean up
374c3b7e
opencl: fix `kernel_restore_block_mxfp4`
464ebeb1
opencl: fix non adreno GPU
36676c0a
opencl: recover broadcast semantic for `mul_mv_mxfp4_f32_flat`
7184682a
opencl: use broadcast semantic for `mul_mv_id_mxfp4_f32_flat`
7a15e0e7
opencl: fix ndst for `mul_mv_mxfp4_f32_flat` for adreno
7aa67ce8
opencl: use original mxfp4 mv for structs
fe12b20c
opencl: fix whitespace
b7423294
opencl: fix whitespace
a69591a7
opencl: fix size calculation when creating image1d_buffer_t
dbe0c3bf
lhez requested a review from lhez 176 days ago
opencl: fix unused variable
7eb7e0d3
lhez approved these changes on 2025-09-18
lhez merged 3edd87cd into master 174 days ago