llama.cpp
961e9a3e - server : do not clear slots without unified KV cache (#24190)

Commit
22 days ago
server : do not clear slots without unified KV cache (#24190)

* Always export idle slots to RAM

Without this, a slot's VRAM cache may not be written to RAM. If this slot happens to be busy then later on, this triggers needless preprocessing in another slot.

* cont : clean-up

---------

Co-authored-by: Christoph Weiss <weiss@wsoptics.de>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>