llama.cpp
server : support unified cache across slots
#16736
Merged
ggerganov merged 14 commits into master from gg/server-unified-slots
github-actions added examples
github-actions added server
slaren commented on 2025-10-23
github-actions added python
ggerganov force pushed from 7a25d4b5, 46 days ago
ggerganov force pushed, 46 days ago
ggerganov force pushed, 46 days ago
ggerganov force pushed to 6369fe09, 46 days ago
github-actions added testing
ggerganov force pushed from 6369fe09 to ac261bea, 45 days ago
ggerganov commented on 2025-10-29
ggerganov force pushed from ac261bea, 44 days ago
ggerganov force pushed to 4e9e319b, 44 days ago
ggerganov marked this pull request as ready for review, 44 days ago
ggerganov requested a review from ngxson, 44 days ago
ggerganov requested a review from CISC, 44 days ago
ggerganov requested a review from slaren, 44 days ago
slaren commented on 2025-11-01
ngxson commented on 2025-11-01
57ece5ba  server : support unified context across slots
a42fb771  cont : fix speculative decoding initialization
492f628c  context : fix n_ctx_per_seq computation
8222e9c2  server : purge slots one by one
21791750  tests : add unified cache server tests
f0f105ff  llama : update per-seq context computation
e7b7cbfb  test-thread-safety : handle tiny training context of the input model
290f6a9f  server : fix server_tokens clear()
23323cd1  server : use 4 slots + unified KV by default
f2cca024  llama : add note about context size queries
ff684363  cont : update todos [no ci]
c08d0d14  context : do not cap the size of the context
ggerganov force pushed from 93373cc5 to c08d0d14, 42 days ago
356dc08b  tests : adjust parameters to be CI friendlier
slaren approved these changes on 2025-11-01
56fceee2  context : add warning
ggerganov merged cd5e3b57 into master, 41 days ago
ggerganov deleted the gg/server-unified-slots branch, 41 days ago
Reviewers: slaren, ngxson, julmb, CISC
Assignees: No one assigned
Labels: testing, examples, python, server
Milestone: No milestone