ggerganov/llama.cpp
Pull Requests
[SYCL] Add OP im2col_3d
documentation
ggml
merge ready
SYCL
#22903 by arthw was merged 2026-05-11 05:01
vendor : update cpp-httplib to 0.43.4
script
python
merge ready
#22888 by cabelo was merged 2026-05-10 16:46
feat: add MiMo v2.5 vision
examples
python
#22883 by AesSedai was merged 2026-05-12 09:11
speculative: fix multimodal MTP seed positions
model
testing
Nvidia GPU
Vulkan
examples
python
server
ggml
Apple Metal
#22881 by trbom5c was closed 2026-05-09 19:51
docs: fix metrics endpoint description in server README
examples
server
#22879 by willjoha was merged 2026-05-11 16:32
server : fix n_predict=-2 (generate until context full)
examples
python
server
#22873 by kimjune01 was closed 2026-05-11 21:27
model : fix model type check for granite/llama3 and deepseek2/glm4.7 lite
model
merge ready
#22870 by CISC was merged 2026-05-10 06:44
updated nix flake
nix
devops
merge ready
#22869 by yuannan was merged 2026-05-09 14:15
resync
python
#22866 by NeosDumb was closed 2026-05-09 07:32
Perf audit and tweaking
ggml
SYCL
#22864 by sanmai was closed 2026-05-09 07:00
WebUI: make the title cool
server/webui
examples
server
#22859 by foldl was closed 2026-05-09 03:51
llama-quant : add missing NVFP4 default type mapping
examples
#22858 by t-timms was closed 2026-05-10 07:24
opencl: add q4_1 MoE for Adreno
ggml
OpenCL
#22856 by shawngu-quic was merged 2026-05-11 18:57
Mtp clean
documentation
model
script
testing
android
server/webui
Nvidia GPU
Vulkan
examples
python
devops
server
ggml
SYCL
Apple Metal
Ascend NPU
WebGPU
#22853 by adybag14-cyber was closed 2026-05-08 21:29
Add preemptive priority scheduling
server/webui
examples
python
server
#22851 by lowlyocean was closed 2026-05-12 19:40
webui: fix LLM title generation for agentic conversations
server/webui
examples
server
#22840 by smugman-dot was merged 2026-05-08 14:36
update BoringSSL to 0.20260508.0
#22839 by cabelo was merged 2026-05-09 07:26
spec : parallel drafting support
examples
server
#22838 by ggerganov was merged 2026-05-11 16:09
hexagon: add HTP kernel for GGML_OP_GATED_DELTA_NET
ggml
Hexagon
#22837 by wyanzhao was merged 2026-05-09 00:12
convert : add split() to LoraTorchTensor in LoRA converter
python
merge ready
#22832 by jesus-talavera-ibm was merged 2026-05-12 05:17
common : do not wrap raw strings in schema parser for tagged parsers
#22827 by aldehir was merged 2026-05-08 20:33
ggml: update SCHED_DEBUG output to use ggml_op_desc()
ggml
#22825 by max-krasnyansky was merged 2026-05-08 05:43
CUDA: lower-case PCI bus id, standardize for ggml
Nvidia GPU
ggml
#22820 by JohannesGaessler was merged 2026-05-08 08:09
convert : fix RuntimeError when stripping FP8 KV-cache scales
python
merge ready
#22818 by pich was merged 2026-05-08 03:55
Feature hexagon l2 norm
ggml
Hexagon
#22816 by pdhinaka was merged 2026-05-08 20:41
Add flash attention MMA / Tiles to support MiMo-V2.5
testing
Nvidia GPU
python
ggml
#22812 by AesSedai was merged 2026-05-09 03:28
ggml-virtgpu: include missing mutex header
ggml
merge ready
#22810 by olliewalsh was merged 2026-05-10 15:32
ggml-webgpu: address precision issues for multimodal
ggml
WebGPU
#22808 by Constannnnnt was merged 2026-05-12 14:27
llama : fix device state save/load
ggml
Apple Metal
#22805 by ggerganov was merged 2026-05-07 18:43
Gemma4_26B_A4B_NvFp4 hf checkpoint convert to gguf format fixes
model
python
#22804 by ynankani was merged 2026-05-08 18:42