sync : ggml #3526

ggerganov merged 51 commits into master from sync-ggml-25-11-17
6e2d45a4 0cc4m vulkan: fix memory allocations (llama/17122)
ce8d1da2 Acly cuda/vulkan : bicubic interpolation (llama/17022)
5c643599 fj-y-saito arm64: add i8mm route with SVE ggml_vec_dot_q4_K_q8_K and ggml_vec_do…
4cd5695c ggerganov metal : enable tensor API for A19 (llama/17087)
a64712e9 0cc4m vulkan: fix validation issue introduced by #16868 (llama/17145)
d1a83fbf 0cc4m vulkan: check glslc executable string (llama/17144)
e4c1e3cd angt ggml-cpu : inspect -march and -mcpu to found the CPU (llama/16333)
4413a561 ggerganov metal : cap threadgroups size of set_rows (llama/17146)
becc46e7 max-krasnyansky cpu: skip NOPs to avoid barriers (llama/17133)
485e4235 lhez opencl: add fastdiv and use it in set_rows, ported from cuda (llama/1…
bee75186 furrysalamander cmake : add version to all shared object files (llama/17091)
2fe28b67 chaxu01 kleidiai: add optimized per-channel kernels for Q8_0 (llama/16993)
abbb5f2a duduta ggml-cpu: templateify ggml_compute_forward_rope_f32 and _f16 (llama/1…
f52e7c75 ixgbe ggml-cpu : add RISC-V RVV (Zvfh) optimization for FP16 to FP32 conver…
c3a1298c netrunnereve disable rms norm mul rope for chips with no fp16 rte (llama/17134)
32d1b349 max-krasnyansky hexagon: various Op fixes (llama/17135)
e9df9581 NeoZhangJianyu fix ci crash about SSM_CONV (llama/17169)
2f2c6c3b TecJesh CANN: Add L2_NORM op support (llama/16856)
6a2c71b9 Alcpz ggml-cpu: handle 3d tensors in repack mat_mul (llama/17030)
a541b0ef ggerganov ggml : use std::sort in ggml_argsort CPU implementation (llama/17211)
214d1af0 JohannesGaessler CUDA: static assert to prevent misuse of memcpy_1 (llama/17198)
be4d1303 am17an CUDA: fuse rope + set_rows (llama/16884)
c880b430 TecJesh CANN: Add cross_entropy_loss op support (llama/16886)
9808706d slaren ggml-cpu : use template for argsort (llama/17222)
b6d0ebe2 ggerganov Revert "ggml-cpu: handle 3d tensors in repack mat_mul (llama/17030)" …
5150c23e bghira metal: accelerated conv2d (llama/17175)
273dd3fe ixgbe ggml-cpu : add RISC-V vector intrinsic support for silu and cvar oper…
312480c9 slaren sched : fix reserve ignoring user tensor assignments (llama/17232)
1b4c6ad1 0cc4m vulkan: remove shell call from vulkan-shaders-gen tool, revert file c…
e9b37f56 pwilkin ggml : add ops SOFTPLUS, EXPM1, TRI, SOLVE_TRI, CUMSUM (llama/17063)
3e4ae291 Alcpz ggml-cpu: handle 3d tensors in repack mat_mul (llama/17241)
ae08083e ggerganov metal : make the FA extra sizes consistent (llama/17143)
a6f1d807 ggerganov metal : support argsort for ne00 > 1024 (llama/17247)
786e0056 jeffbolznv vulkan: change graph_compute to be async and enable get_tensor_async …
9d3fa94c jeffbolznv vulkan: skip all-negative-inf blocks in FA (llama/17186)
a175f857 jeffbolznv vulkan: Use ggml_vk_tensor_subbuffer in mul_mat_vec(id) paths (llama/…
89f82bfe giuseppe vulkan: implement ABS and NEG (llama/17245)
5ae41738 0cc4m vulkan: Replace 16-bit unpack8 calls to work around legacy Windows AM…
5d9fba0a jeffbolznv vulkan: Fuse mul_mat_id+add_id+mul and mul_mat+add+add. (llama/17287)
d7356143 shani-f sycl : unify unary kernels with a generic implementation and enable w…
14dac59d shaofeiqi opencl: add kernel to handle mat mul in attention to improve encoding…
9c2bde0f lhez opencl: fix rms_norm_mul (llama/17250)
844275a8 ggerganov metal : remove obosolete asserts (llama/17295)
75cfe4a6 0cc4m vulkan: fix MMQ quantize_y condition (llama/17301)
4f694e4f zayac vulkan: add LOG operation support for F32 and F16 (llama/17183)
7e090958 hipudding CANN: Use smart pointers to manage ACL objects (llama/17238)
25182a79 ggerganov metal : add cumsum (llama/17305)
8208359a ggerganov metal : faster argsort (llama/17315)
714c1ba1 ggerganov metal : support I32 -> I32 copy (llama/17317)
36b80f63 ggerganov sync : ggml
3e980fd5 ggerganov sync : llama.cpp
danbev approved these changes on 2025-11-17
ggerganov merged b12abefa into master 57 days ago
ggerganov deleted the sync-ggml-25-11-17 branch 57 days ago