69943f8b opencl: support ne3 in get_rows (llama/15866)
3a5a3546 ggml webgpu: support for rope,div,sub,glu,scale,cont operators (llama…
a57c9f69 opencl: support pad_ext (llama/15888)
032abbcc vulkan: make ggml_vk_default_dispatcher support older vulkan headers …
4d08090e HIP: Disable ROCWMMA fattn on CDNA when compiled against ROCWMMA 2.0.…
3b2df32a musa: update compile flags (llama/16265)
5cd34243 model : Apertus model implementation (llama/15852)
cc6dc14e ggml webgpu: add support for soft_max, optimize rms_norm (llama/16357)
3f2ecffc vulkan: in flash attention, bounds check against nem1 (don't rely on …
fe538c22 vulkan: Fix FA coopmat1 invalid array indexing (llama/16365)
2e235913 vulkan: Replace uses of maxMemoryAllocationSize and VK_WHOLE_SIZE (ll…
3d3000fb ggml : fix graph reallocation with multiple chunks (llama/16396)
5f895996 metal : fix loop bound in ggml_mem_ranges (llama/16412)
75159b53 vulkan : incremental shader builds (llama/16341)
0c56ec3e rpc : add support for multiple devices (llama/16276)
98b549d5 rpc : check src buffer when copying tensor (llama/16421)
6e7e1b8d vulkan: use a more appropriate amount of threads when generating shad…
72b9fa00 ggml webgpu: actually add softmax, fix rms_norm offset (llama/16400)
73265c03 ggml-cpu : fix leftover handling in ggml_vec_scale_f32 for SVE (llama…
352a07a2 ggml : fix unaligned access in AMX code (llama/16315)
389681e7 metal : various optimizations + refactoring (llama/16446)
091a5c11 tests : add -INF blocks to the KQ mask in the FA tests (llama/16380)
d75f9ae9 metal : add support for non-padded FA KV (llama/16148)
c8d88fc2 ggml webgpu: profiling, CI updates, reworking of command submission (…
1b7b1200 metal : mark FA blocks (llama/16372)
57d8e6b1 Disable CUDA host buffers on integrated GPUs (llama/16308)
73b3339f refactor soft_max, add soft_max_back (llama/16472)
ba2e955f kleidiai: kernel interface refactoring (llama/16460)
910395c5 CANN: Improve ACL graph matching (llama/16166)
779ca59c cpu : optimize the ggml NORM operation (llama/15953)
667e3645 cmake : Dont define XOPENSOURCE on AIX (llama/16481)
d4775054 cuda : avoid initializing unused devices (llama/16510)
33f78624 metal : fix mul-mm condition + fix mul-mv permuted kernels (llama/16494)
4f776684 sync : ggml
2ad7a695 talk-llama : sync llama.cpp
55d8f017 bench : update [no ci]
danbev approved these changes on 2025-10-12
ggerganov merged commit ea174c62 into master 156 days ago
ggerganov deleted the sync-ggml-25-11-12 branch 156 days ago