ggerganov/llama.cpp
Pull Requests
Commits
Open
Closed
server : fix division by zero when reporting stats
examples
server
#16501 opened 2025-10-10 15:18 by
ggerganov
vendor : sync minja
#16500 opened 2025-10-10 14:00 by
CISC
Switch to using Ubuntu 25.10 vulkan/mesa
devops
#16497 opened 2025-10-10 11:09 by
ericcurtin
Properly handle bracket delimiters for LaTeX formulas in chat message output
server/webui
examples
bugfix
server
#16496 opened 2025-10-10 10:20 by
allozaur
metal : fix mul-mm condition + fix mul-mv permuted kernels
ggml
Apple Metal
#16494 opened 2025-10-10 07:28 by
ggerganov
CUDA: faster tile FA, add oob checks, more HSs
Nvidia GPU
python
ggml
#16492 opened 2025-10-09 22:03 by
JohannesGaessler
graph : reuse SSM graphs
#16490 opened 2025-10-09 16:58 by
ggerganov
Remove Legacy Copy-OP Pointer Indirection Code
Nvidia GPU
ggml
#16485 opened 2025-10-09 12:43 by
anavp-nvidia
Add AfmoeForCausalLM support
python
#16477 opened 2025-10-08 20:58 by
bartowski1182
fix: convert_hf_to_gguf - change Jamba non-sentencepiece mode (tokeni…
python
#16470 opened 2025-10-08 10:20 by
amirai21
opencl: add q8_0 mm support
testing
ggml
OpenCL
#16469 opened 2025-10-08 05:45 by
lhez
vulkan: Add State Space Model (SSM) Operations Support
Vulkan
ggml
#16463 opened 2025-10-07 13:28 by
giuseppe
Add hipblasLt implementation for batched gemm to improve performance for CDNA3 only
Nvidia GPU
ggml
#16457 opened 2025-10-07 07:27 by
peizhang56
vulkan: Handle FA with all -inf mask values
Vulkan
ggml
#16447 opened 2025-10-06 19:00 by
jeffbolznv
Metal Pool 1D Kernel
testing
ggml
Apple Metal
#16429 opened 2025-10-05 09:06 by
ThoreKoritzius
fix: add generic fallback to detect trailing <think> tags in Jinja templates and handle forced-open reasoning blocks
testing
#16426 opened 2025-10-04 19:08 by
ServeurpersoCom
Implement llama-pull tool
examples
#16423 opened 2025-10-04 17:02 by
ericcurtin
contrib : add fish completions via --completion-fish
#16404 opened 2025-10-03 06:46 by
g0t4
server / ranking : add sorting and management of top_n
examples
server
#16403 opened 2025-10-03 06:37 by
YannFollet
Add ARANGE Operator to SYCL Backend (Small & Focused Changes)
ggml
SYCL
#16362 opened 2025-09-30 23:35 by
GittyBurstein
feat: render user content as markdown option
examples
server
#16358 opened 2025-09-30 19:20 by
ServeurpersoCom
SYCL SET operator optimized for F32 tensors
ggml
SYCL
#16350 opened 2025-09-30 12:56 by
GittyBurstein
Update build.md
documentation
#16346 opened 2025-09-30 06:19 by
refine360-debug
Svelte webui model selector
examples
server
#16335 opened 2025-09-29 16:29 by
ServeurpersoCom
ggml-cpu : inspect -march and -mcpu to find the CPU
ggml
#16333 opened 2025-09-29 14:05 by
angt
Enable per-conversation loading states to allow having parallel conversations
examples
server
#16327 opened 2025-09-29 10:19 by
allozaur
ggml : remove KQ mask padding
ggml
#16309 opened 2025-09-28 15:21 by
ggerganov
Add a deepwiki badge to auto-refresh the wiki-in-deepwiki weekly.
#16296 opened 2025-09-28 06:14 by
0400H
hip : substituted bpermute ops with swizzle ops (gfx906, maybe all AMD)
Nvidia GPU
ggml
#16291 opened 2025-09-27 16:10 by
iacopPBK
Update convert_hf_to_gguf_update.py
python
#16280 opened 2025-09-26 14:04 by
cpumaxx