sync : ggml #3583

ggerganov merged 59 commits into master from sync-ggml-25-12-31
- cde9758b zhang-hui-yulo: HIP: Refactor mma for RDNA and CDNA (llama/17990)
- 126deb39 Alcpz: ggml-cpu: ARM64: repack version of q8_0 (dotprod and i8mm) (llama/18096)
- 2d9e9d54 joeldushouyu: ggml-hexagon: gelu operation (llama/17921)
- 013ebf19 joeldushouyu: ggml-hexagon: swiglu_oai operation (llama/18114)
- b03ed79b zhang-hui-yulo: remove i_major_dual (llama/18157)
- f9d36bd8 taimur-10x: ggml-cpu: extend support for RVV floating-point kernels (llama/17318)
- f09b84b0 ngxson: model : add ASR support for LFM2-Audio-1.5B (conformer) (llama/18106)
- baa5c1db jeffbolznv: vulkan: Add perf logger mode with concurrency (llama/17944)
- 5a1735a7 ngdxzy: ggml-hexagon: Implement true Q8_0 quantization on Hexagon NPU for mor…
- 1dd0dfee Aadeshveer: Added comments explaining thread block size selection logic based on …
- 216ddd8a lovedheart: Vulkan: some improvement on mul_mat_iq2_xs (llama/18031)
- 3218970f jeffbolznv: vulkan: in graph_optimize, try to group ADD operations (llama/18060)
- 9a60d12d jeffbolznv: vulkan: support GGML_UNARY_OP_XIELU (llama/18062)
- 5506ddb2 jeffbolznv: vulkan/cuda: fix topk_moe with exp_probs_b (llama/18071)
- e5d89abd jeffbolznv: vulkan: fix im2col overflowing maxworkgroupcount (llama/18180)
- 092caa4c JohannesGaessler: llama: fix RPC for -fit on (llama/18233)
- 129f9631 jeffbolznv: vulkan: Implement set_tensor_async and the event interfaces (llama/18…
- ad5a5115 jeffbolznv: vulkan: Extend rope fusions to allow mrope (llama/18264)
- ca5e155b lhez: opencl: unpack q4_0 for adreno in get_tensor (llama/18278)
- 06b4ca96 taimur-10x: llamafile: add rvv support for sgemm kernels (llama/18199)
- de7f933f joeldushouyu: ggml-hexagon: gelu optimization (llama/18151)
- 81f95754 chraac: ggml-hexagon: create generalized functions for cpu side op (llama/17500)
- f5789f82 struct: rpc : add check for rpc buffer type (llama/18242)
- 56edad8e TianHao324: CANN: Uses yarn_ramp cache in ROPE (llama/17725)
- 61872d57 0cc4m: vulkan: use fewer FA rows for small cache runs (llama/18280)
- de40463b wangweixuan: CANN : refactor ACL graph cache (llama/17752)
- 4137b226 jeffbolznv: vulkan: fix command buffer corruption in ggml_backend_vk_event_wait (…
- 32cc57d3 am17an: CUDA: experimental native mxfp4 support for blackwell (llama/17906)
- 17c42517 Aadeshveer: ggml : optimize cuda cumsum fallback kernel (llama/18343)
- a7b7da6f Intellouis: CANN: Add support for CONV_TRANSPOSE_1D when kernel size > 255 (llama…
- e1edaffb am17an: ggml-cuda: fix blackwell native builds (llama/18361)
- dc1099c2 am17an: cuda: optimize cumsum cub path (llama/18362)
- bc820e90 am17an: ggml-cuda: fix regex for arch list (llama/18371)
- 7c6e891b 0Marble: CANN: implement the SSM_CONV operator (llama/17737)
- a1cc693c jeffbolznv: vulkan: handle rope with large number of rows (llama/18306)
- 09bd04f3 jeffbolznv: vulkan: Support UPSCALE w/antialias (llama/18327)
- b311303e netrunnereve: vulkan: small dequantization improvements (llama/18380)
- d78f3150 jeffbolznv: vulkan: Use BK=32 for coopmat2 mul_mat_id (llama/18332)
- f8dfe0ab jeffbolznv: vulkan: optimize decodeFuncB in coopmat2 mul_mat_id shader (llama/18349)
- d63cb2af jeffbolznv: vulkan: preprocess mul_mat_id experts and discard workgroups more qui…
- 78b9d734 am17an: ggml-cuda: Use same regex for GGML_NATIVE=OFF (llama/18407)
- 6a9983bf lhez: opencl: allow resizing transpose buffers (llama/18384)
- 2971350e QDelta: ggml-cuda: use CMAKE_CUDA_ARCHITECTURES if set when GGML_NATIVE=ON (l…
- c5ef983c bberberov: cmake: Added more x86_64 CPU backends when building with `GGML_CPU_AL…
- 76efccb1 o7si: rpc: fix segfault on invalid endpoint format (llama/18387)
- d50bcb10 am17an: Revert "ggml-cuda: use CMAKE_CUDA_ARCHITECTURES if set when GGML_NATI…
- 60f59a1b IMbackK: HIP: Use mmq on MFMA devices for MUL_MAT_ID in cases where a lot of s…
- dccb3be2 am17an: cuda: fix race condition in cumsum (llama/18448)
- 4795bbf8 JohannesGaessler: CUDA: Blackwell features for non-native builds (llama/18436)
- a29fc211 JohannesGaessler: CUDA: fix replacment of bad archs in CMake (llama/18457)
- db760ebb am17an: CUDA: add log line when mxfp4 acceleration is used (llama/18483)
- 5f634957 chaxu01: kleidiai: add and integrate SVE 256-bit vector-length kernel (llama/1…
- 46246528 rrsathe: Work around broken IntelSYCLConfig.cmake in Intel oneAPI 2025.x (llam…
- a1780c7c am17an: sycl: add newline at the end of CMakeLists.txt (llama/18503)
- 3aad8322 ggerganov: metal : remove BF16 x F16 kernels (llama/18456)
- 60add562 JohannesGaessler: CUDA: fix KQ max calculation (llama/18487)
- 94169eac gatbontonpc: metal : add count_equal op (llama/18314)
- 3b318610 ggerganov: sync : ggml
- 9faea66e ggerganov: talk-llama : sync llama.cpp
danbev approved these changes on 2025-12-31
ggerganov merged 7359ac94 into master 13 days ago
ggerganov deleted the sync-ggml-25-12-31 branch 13 days ago
