43be89ef SOLVE_TRI extension to more dimensions (llama/17793)
951f8a97 HIP: enable mmf for RDNA3 (llama/17879)
a21115d3 ggml-cpu : fix RISC-V Q4_0 repack select and RVV feature reporting (l…
240cb2db cann : fix ops broken by circular padding guard (llama/17825)
8d05cf47 CUDA: fix overflow in MMA kernel without stream-k (llama/17939)
4de56992 vulkan: Allow non-pow2 n_experts in topk_moe (llama/17872)
96fa6388 vulkan: Multi-pass softmax for large number of cols (llama/17892)
5cb35693 vulkan: support GGML_OP_DIAG (llama/17893)
c62c6104 vulkan: support get_rows for i32 (llama/17941)
e500fa6c ggml : arm repack fix build (llama/0)
9a211ac9 vulkan: faster q6_k matmul (llama/17813)
a2380a38 vulkan: improve mul_mat_vec_iq1_s speed (llama/17874)
72c24f6e vulkan: Fix data race/hang in scalar/cm1 flash attention (llama/17887)
631c3b39 vulkan: fix mul_mat_vec_iq1_s formatting (llama/18026)
6f0e8fec Support gpt-oss by OPs add-id, mul_mat for mxfp4, swiglu_oai (llama/1…
542f01ec llama: automatically set parameters not set by the user in such a way…
90e623c2 metal: use shared buffers on eGPU (llama/17866)
5cac2247 ggml-hexagon: mm for mtmd (llama/17894)
6829ed71 ggml : use WARP_SIZE/2 for argmax reduction offset (llama/18092)
7e9b6ebb llama.android : Rewrite Android binding (w/o cpu_features dep) (llama…
82ee376b sync : ggml
fcaa8f8a talk-llama : sync llama.cpp
danbev approved these changes on 2025-12-17.
ggerganov merged commit 6c22e792 into master 27 days ago.