43be89ef SOLVE_TRI extension to more dimensions (llama/17793)
951f8a97 HIP: enable mmf for RDNA3 (llama/17879)
a21115d3 ggml-cpu : fix RISC-V Q4_0 repack select and RVV feature reporting (l…
240cb2db cann : fix ops broken by circular padding guard (llama/17825)
8d05cf47 CUDA: fix overflow in MMA kernel without stream-k (llama/17939)
4de56992 vulkan: Allow non-pow2 n_experts in topk_moe (llama/17872)
96fa6388 vulkan: Multi-pass softmax for large number of cols (llama/17892)
5cb35693 vulkan: support GGML_OP_DIAG (llama/17893)
c62c6104 vulkan: support get_rows for i32 (llama/17941)
e500fa6c ggml : arm repack fix build (llama/0)
9a211ac9 vulkan: faster q6_k matmul (llama/17813)
a2380a38 vulkan: improve mul_mat_vec_iq1_s speed (llama/17874)
72c24f6e vulkan: Fix data race/hang in scalar/cm1 flash attention (llama/17887)
631c3b39 vulkan: fix mul_mat_vec_iq1_s formatting (llama/18026)
6f0e8fec Support gpt-oss by OPs add-id, mul_mat for mxfp4, swiglu_oai (llama/1…
542f01ec llama: automatically set parameters not set by the user in such a way…
90e623c2 metal: use shared buffers on eGPU (llama/17866)
5cac2247 ggml-hexagon: mm for mtmd (llama/17894)
6829ed71 ggml : use WARP_SIZE/2 for argmax reduction offset (llama/18092)
7e9b6ebb llama.android : Rewrite Android binding (w/o cpu_features dep) (llama…
82ee376b sync : ggml
fcaa8f8a talk-llama : sync llama.cpp
danbev approved these changes on 2025-12-17.
ggerganov merged commit 6c22e792 into master 27 days ago.