llama.cpp
server: add auto-sleep after N seconds of idle
#18228
Merged


ngxson
ngxson implement sleeping at queue level
e1d7b434
ngxson implement server-context suspend
197e5785
ngxson add test
db3b78d2
github-actions added the examples label
github-actions added the server label
ngxson add docs
aea8f8c1
ngxson marked this pull request as ready for review 24 days ago
ngxson requested a review from ggerganov 24 days ago
ngxson requested a review from ServeurpersoCom 24 days ago
ngxson optimization: add fast path
44a5a26c
github-actions added the python label
ngxson make sure to free llama_init
e6ab62c4
ngxson nits
937b0641
ngxson fix use-after-free
105e2f3c
ngxson allow /models to be accessed during sleeping, fix use-after-free
fd09f880
ngxson don't allow accessing /models during sleep, it is not thread-safe
0bb9bc48
ngxson fix data race on accessing props and model_meta
d8500827
ngxson small clean up
1663d2f8
ngxson trailing whitespace
b51da9a1
ngxson rm outdated comments
06a5ebe1
ServeurpersoCom approved these changes on 2025-12-21
ServeurpersoCom merged ddcb75dd into master 23 days ago
