llama.cpp
ddcb75dd - server: add auto-sleep after N seconds of idle (#18228)

Commit
15 days ago
server: add auto-sleep after N seconds of idle (#18228)

* implement sleeping at queue level
* implement server-context suspend
* add test
* add docs
* optimization: add fast path
* make sure to free llama_init
* nits
* fix use-after-free
* allow /models to be accessed during sleeping, fix use-after-free
* don't allow accessing /models during sleep, it is not thread-safe
* fix data race on accessing props and model_meta
* small clean up
* trailing whitespace
* rm outdated comments