llama.cpp
server : host-memory prompt caching
#16391
Merged

ggerganov merged 17 commits into master from gg/prompt-cache-ext
ggerganov
github-actions added examples
github-actions added server
ggerganov force pushed from 41271998 to 0787f036 8 days ago
ggerganov force pushed from 0787f036 to 5c0cec4c 7 days ago
tommarques56
ggerganov force pushed from 5c0cec4c to 1440ec5c 4 days ago
ggerganov changed the base branch from master to gg/server-checkpoints-improve 4 days ago
github-actions added python
ggerganov force pushed from 9de83929 to cf7dd4bd 4 days ago
ggerganov
Base automatically changed from gg/server-checkpoints-improve to master 3 days ago
ggerganov minor : code style
8d518d7c
ggerganov server : fix prompt similarity calculation
ca01e7f1
ggerganov server : initial host-memory prompt caching
668a436e
ggerganov cont
32347230
ggerganov server : refactor
967b1e45
ggerganov cont
83ce8cbc
ggerganov cont : make the server task of the slot const
ba8ffa78
ggerganov cont : minor [no ci]
23b7f765
ggerganov server : cache prompts and checkpoints only for completion tasks
c32d8b40
ggerganov server : improve prompt caching logic
677b10dd
ggerganov cont : fix check for number of cached prompts [no ci]
264d2c37
ggerganov force pushed from 65e89912 to 264d2c37 3 days ago
ggerganov server : improve caching logic, add -cram CLI arg
f42dfa45
ggerganov server : print prompt mismatch info
bf10940e
ggerganov marked this pull request as ready for review 3 days ago
ggerganov requested a review from ngxson 3 days ago
ggerganov cont : better naming [no ci]
bc6e238e
ggerganov server : improve prompt cache loading logic
b612f7fd
ggerganov
ggerganov server : add option to debug the slot contents (#16482)
c5e5167d
ggerganov server : add option to disable prompt cache
ff334062
ggerganov merged d00cbea6 into master 1 day ago
ggerganov deleted the gg/prompt-cache-ext branch 1 day ago
jukofyork
