ggml-org/llama.cpp
Pull Requests
convert : set "add bos" == True for Gemma 4
python
#21500 opened 2026-04-06 07:25 by ggerganov
fix: sanitize subprocess call in bench.py
examples
python
server
#21498 opened 2026-04-06 06:07 by orbisai0security
vocab : remove </s> eog token if gemma4
#21492 opened 2026-04-06 01:47 by aldehir
docs: add hunyuan-ocr gguf, also add test [no ci]
documentation
examples
#21490 opened 2026-04-05 22:12 by ngxson
mtmd: fit_params now take into account mmproj
examples
server
#21489 opened 2026-04-05 21:25 by ngxson
vocab : add byte token handling to BPE detokenizer for Gemma4
#21488 opened 2026-04-05 21:08 by aldehir
console: fix stripping of \n in multiline input
#21485 opened 2026-04-05 20:51 by bipinyadav3175
llama-quantize: fix tensor-type logic
#21482 opened 2026-04-05 18:39 by theo77186
server : handle unsuccessful sink.write in chunked stream provider
examples
server
#21478 opened 2026-04-05 17:01 by lainon1
server: add null check for context to prevent segfault on init failure
examples
server
#21477 opened 2026-04-05 16:54 by Anirudh171202
gguf-py: Fix lazy tensor handling for keyword arguments
python
#21476 opened 2026-04-05 16:21 by lainon1
llama-quant: use LLM_KV constants instead of hardcoded strings
#21475 opened 2026-04-05 15:28 by lainon1
CUDA: make cuda graphs props check faster
Nvidia GPU
ggml
#21472 opened 2026-04-05 14:13 by am17an
ggml : fix repeat_back assert with non-contiguous gradients
ggml
#21467 opened 2026-04-05 11:25 by RealOrko
ggml : add GGML_OP_GATHER for DeepSeek Sparse Attention (DSA) #21149
testing
ggml
#21458 opened 2026-04-05 03:55 by LilySu
vulkan: Support GGML_TYPE_NVFP4
Vulkan
ggml
#21455 opened 2026-04-05 03:43 by jeffbolznv
metal : add GATED_LINEAR_ATTN op
documentation
testing
ggml
Apple Metal
#21452 opened 2026-04-05 00:58 by TheTom
Gemma 4: move some computations to BF16
model
Nvidia GPU
examples
python
ggml
#21451 opened 2026-04-05 00:24 by pwilkin
metal: speed up Qwen3-VL image encoding on large images by ~11%
ggml
Apple Metal
#21443 opened 2026-04-04 19:12 by Avidanborisov
eagle3: add qwen3.5 4B 9B 35B-A3B support
model
examples
python
server
#21437 opened 2026-04-04 14:35 by 36330
fix(gemma4): handle nullable type arrays
testing
#21433 opened 2026-04-04 13:55 by gerstnr
vulkan: Tweak Xe2 warptile configuration
Vulkan
ggml
#21431 opened 2026-04-04 12:23 by TheBlueMatt
mtmd: add Gemma 4 audio conformer encoder support
documentation
testing
Nvidia GPU
examples
ggml
#21421 opened 2026-04-04 09:54 by stephencox-ict
model: add Zamba2 architecture support
model
#21412 opened 2026-04-04 04:58 by echo313unfolding
fix(openvino): explicit ov::Tensor frees in ggml_backend_openvino_free
ggml
OpenVINO
#21411 opened 2026-04-04 04:25 by thedanhoffman
cmake: add flag to use system httplib
build
#21407 opened 2026-04-04 03:21 by WhyNotHugo
vendor : update cpp-httplib to 0.41.0
script
python
#21405 opened 2026-04-04 02:03 by cabelo
fix tensor stride for quantized types in create_tensor
#21397 opened 2026-04-03 22:40 by Perinban
[SYCL] Add BF16 support to GET_ROWS operation
ggml
SYCL
#21391 opened 2026-04-03 20:26 by devedse
webui: implement pinned conversations support
examples
server
#21387 opened 2026-04-03 19:25 by remeh