llama.cpp
server : host-memory prompt caching
#16391
Merged
ggerganov merged 17 commits into master from gg/prompt-cache-ext
github-actions added the examples label
github-actions added the server label
ggerganov force pushed from 41271998 to 0787f036 (8 days ago)
ggerganov force pushed from 0787f036 to 5c0cec4c (7 days ago)
ggerganov force pushed from 5c0cec4c to 1440ec5c (4 days ago)
ggerganov changed the base branch from master to gg/server-checkpoints-improve (4 days ago)
github-actions added the python label
ggerganov force pushed from 9de83929 to cf7dd4bd (4 days ago)
Base automatically changed from gg/server-checkpoints-improve to master (3 days ago)
8d518d7c minor : code style
ca01e7f1 server : fix prompt similarity calculation
668a436e server : initial host-memory prompt caching
32347230 cont
967b1e45 server : refactor
83ce8cbc cont
ba8ffa78 cont : make the server task of the slot const
23b7f765 cont : minor [no ci]
c32d8b40 server : cache prompts and checkpoints only for completion tasks
677b10dd server : improve prompt caching logic
264d2c37 cont : fix check for number of cached prompts [no ci]
ggerganov force pushed from 65e89912 to 264d2c37 (3 days ago)
f42dfa45 server : improve caching logic, add -cram CLI arg
bf10940e server : print prompt mismatch info
ggerganov marked this pull request as ready for review (3 days ago)
ggerganov requested a review from ngxson (3 days ago)
bc6e238e cont : better naming [no ci]
b612f7fd server : improve prompt cache loading logic
c5e5167d server : add option to debug the slot contents (#16482)
ff334062 server : add option to disable prompt cache
ggerganov merged commit d00cbea6 into master (1 day ago)
ggerganov deleted the gg/prompt-cache-ext branch (1 day ago)
Reviewers: ngxson
Assignees: No one assigned
Labels: examples, python, server
Milestone: No milestone