sync : ggml #3572

ggerganov merged 22 commits into master from sync-ggml-25-12-17
pwilkin SOLVE_TRI extension to more dimensions (llama/17793)
43be89ef
zhang-hui-yulo HIP: enable mmf for RDNA3 (llama/17879)
951f8a97
ixgbe ggml-cpu : fix RISC-V Q4_0 repack select and RVV feature reporting (l…
a21115d3
CISC cann : fix ops broken by circular padding guard (llama/17825)
240cb2db
JohannesGaessler CUDA: fix overflow in MMA kernel without stream-k (llama/17939)
8d05cf47
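A common cause of this kind of kernel overflow is 32-bit index arithmetic: once the tensor is large enough, `row * cols` wraps around in a signed int. The sketch below shows that generic pattern and the usual 64-bit promotion fix; it is illustrative only and is not the actual MMA kernel (the accessor name and parameters are hypothetical).

```cuda
#include <cstdint>

// Hypothetical accessor, not the real MMA kernel: shows the generic 32-bit
// index overflow and the 64-bit promotion that avoids it.
__device__ float load_element(const float * data, int row, int col, int cols) {
    // Overflow pattern: row * cols is evaluated in 32-bit int and wraps once
    // the tensor holds more than ~2^31 elements.
    //     return data[row * cols + col];

    // Fix pattern: promote one operand to 64 bits before the multiplication.
    return data[(int64_t) row * cols + col];
}
```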
jeffbolznv vulkan: Allow non-pow2 n_experts in topk_moe (llama/17872)
4de56992
jeffbolznv vulkan: Multi-pass softmax for large number of cols (llama/17892)
96fa6388
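The usual multi-pass formulation of a numerically stable softmax lets a row have more columns than a single workgroup pass can cover: the row is walked in chunks once for the running maximum, once for the exponential sum, and once to normalize. The CUDA sketch below shows that general pattern only; the actual change is a Vulkan compute shader, and the kernel name and launch layout here are assumptions.

```cuda
#include <cfloat>

// One block per row; each pass strides over the columns so the row length is
// not limited by the block size. Launch with blockDim.x a power of two and
// blockDim.x * sizeof(float) bytes of dynamic shared memory.
__global__ void softmax_multipass(const float * x, float * y, int ncols) {
    extern __shared__ float buf[];
    const float * row_in  = x + (size_t) blockIdx.x * ncols;
    float       * row_out = y + (size_t) blockIdx.x * ncols;

    // Pass 1: row maximum.
    float m = -FLT_MAX;
    for (int c = threadIdx.x; c < ncols; c += blockDim.x) {
        m = fmaxf(m, row_in[c]);
    }
    buf[threadIdx.x] = m;
    __syncthreads();
    for (int s = blockDim.x / 2; s > 0; s >>= 1) {
        if (threadIdx.x < s) {
            buf[threadIdx.x] = fmaxf(buf[threadIdx.x], buf[threadIdx.x + s]);
        }
        __syncthreads();
    }
    const float row_max = buf[0];
    __syncthreads();

    // Pass 2: sum of exponentials, shifted by the max for stability.
    float sum = 0.0f;
    for (int c = threadIdx.x; c < ncols; c += blockDim.x) {
        sum += expf(row_in[c] - row_max);
    }
    buf[threadIdx.x] = sum;
    __syncthreads();
    for (int s = blockDim.x / 2; s > 0; s >>= 1) {
        if (threadIdx.x < s) {
            buf[threadIdx.x] += buf[threadIdx.x + s];
        }
        __syncthreads();
    }
    const float inv_sum = 1.0f / buf[0];

    // Pass 3: write the normalized probabilities.
    for (int c = threadIdx.x; c < ncols; c += blockDim.x) {
        row_out[c] = expf(row_in[c] - row_max) * inv_sum;
    }
}
```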
jeffbolznv vulkan: support GGML_OP_DIAG (llama/17893)
5cb35693
jeffbolznv vulkan: support get_rows for i32 (llama/17941)
c62c6104
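GGML_OP_GET_ROWS is a row gather: each output row is a copy of the source row selected by an int32 index tensor, and this commit adds i32 source data on the Vulkan backend. A minimal CUDA sketch of the gather pattern for illustration (the kernel name and launch layout are assumptions, not the Vulkan shader):

```cuda
#include <cstdint>

// One block per output row: dst[r] = src[rows[r]].
__global__ void get_rows_i32(const int32_t * src, const int32_t * rows,
                             int32_t * dst, int ncols) {
    const int r       = blockIdx.x;   // output row
    const int src_row = rows[r];      // source row to gather
    for (int c = threadIdx.x; c < ncols; c += blockDim.x) {
        dst[(size_t) r * ncols + c] = src[(size_t) src_row * ncols + c];
    }
}
```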
ggerganov ggml : arm repack fix build (llama/0)
e500fa6c
netrunnereve vulkan: faster q6_k matmul (llama/17813)
9a211ac9
lovedheart vulkan: improve mul_mat_vec_iq1_s speed (llama/17874)
a2380a38
jeffbolznv vulkan: Fix data race/hang in scalar/cm1 flash attention (llama/17887)
72c24f6e
0cc4m vulkan: fix mul_mat_vec_iq1_s formatting (llama/18026)
631c3b39
NeoZhangJianyu Support gpt-oss by OPs add-id, mul_mat for mxfp4, swiglu_oai (llama/1…
6f0e8fec
JohannesGaessler llama: automatically set parameters not set by the user in such a way…
542f01ec
jdemeule metal: use shared buffers on eGPU (llama/17866)
90e623c2
joeldushouyu ggml-hexagon: mm for mtmd (llama/17894)
5cac2247
Aadeshveer ggml : use WARP_SIZE/2 for argmax reduction offset (llama/18092)
6829ed71
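The argmax change concerns the warp-level reduction: starting the butterfly offset at WARP_SIZE/2 and halving it each step covers every lane of the warp. A sketch of that reduction shape (illustrative only; `warp_argmax` is a made-up helper, not the ggml kernel):

```cuda
#define WARP_SIZE 32

// Butterfly argmax reduction: after the loop every lane of the warp holds the
// warp-wide maximum value and its index.
__device__ void warp_argmax(float & val, int & idx) {
    #pragma unroll
    for (int offset = WARP_SIZE / 2; offset > 0; offset >>= 1) {
        const float other_val = __shfl_xor_sync(0xFFFFFFFF, val, offset, WARP_SIZE);
        const int   other_idx = __shfl_xor_sync(0xFFFFFFFF, idx, offset, WARP_SIZE);
        if (other_val > val) {
            val = other_val;
            idx = other_idx;
        }
    }
}
```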
naco-siren llama.android : Rewrite Android binding (w/o cpu_features dep) (llama…
7e9b6ebb
ggerganov sync : ggml
82ee376b
ggerganov talk-llama : sync llama.cpp
fcaa8f8a
danbev approved these changes on 2025-12-17
ggerganov merged 6c22e792 into master 27 days ago
