sync : ggml #3478

ggerganov merged 23 commits into master from sync-ggml-25-10-14
JohannesGaessler CUDA: faster tile FA, add oob checks, more HSs (llama/16492)
791e60a6
sirus20x6 ggml: Correct SVE implementation in ggml_vec_dot_f16_unroll (llama/16…
7847625f
sirus20x6 ggml : Fix FP16 ELU positive branch (llama/16519)
45e26a5f
NeoZhangJianyu fix UT fault cases: count-equal, argsort, pad OPs (llama/16521)
99d07411
cern1710 metal : add opt_step_adamw and op_sum (llama/16529)
1a08f1d9
hipudding CANN: Update several operators to support FP16 data format (llama/16251)
f9de4e2a
ggerganov ggml : fix scalar path for computing norm (llama/16558)
6dd86087
cern1710 metal: add support for opt_step_sgd (llama/16539)
5ef11175
noemotiovon CANN: fix CPU memory leak in CANN backend (llama/16549)
b7c7d0c7
DamonFool ggml : fix build broken with -march=armv9-a on MacOS (llama/16520)
7ce6c536
JohannesGaessler CUDA: fix numerical issues in tile FA kernel (llama/16540)
e2b9c209
lhez opencl: fix build targeting CL 2 (llama/16554)
6839554b
ggerganov metal : FA support F32 K and V and head size = 32 (llama/16531)
d98a1645
anavp-nvidia cuda : remove legacy copy-op pointer indirection code (llama/16485)
d541d242
am17an CUDA: add fp kernel for larger batch size MoE (llama/16512)
17d67cab
am17an CUDA: use fastdiv + ggml_cuda_mad for mmvf (llama/16557)
360acc78
JohannesGaessler CUDA: enable FA for FP32 KV cache (llama/16546)
395008b4
jeffbolznv vulkan: Improve build time for MSVC (llama/16545)
0f82a3c5
jeffbolznv vulkan: Support FA with K/V in F32 (llama/16543)
c03c4348
am17an CUDA + openCL: fix bug in accessing rms_norm->src while doing fusion …
296f8c0b
SavicStefan vulkan: Add ACC_TYPE_VEC2 implementation (llama/16203)
84521445
ggerganov sync : ggml
c5d5a808
ggerganov talk-llama : sync llama.cpp
2eb25b13
danbev approved these changes on 2025-10-15
ggerganov merged 8ba3c13b into master 156 days ago
ggerganov deleted the sync-ggml-25-10-14 branch 156 days ago

