llama.cpp
a4854f03
- cont : improve n_cmpl logic
Commit
14 days ago
cont : improve n_cmpl logic

- launch the parent task first so it finds the slot with best cache
- parent task waits for child tasks to be launched
- when a child task finishes - remove its cache
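
The ordering described in the commit message can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the server code touched by this commit: `Slot`, `pick_best_slot`, `launch_completions`, `finish_child` and `finish_parent` are hypothetical names, "best cache" is assumed to mean the longest shared prompt prefix, and the parent "waiting" is modeled simply by the launch function returning only after all children have been assigned a slot.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical toy model of a server slot; NOT the actual llama.cpp server types.
struct Slot {
    std::string cache;        // prompt text whose state is assumed to be cached here
    bool        busy = false; // true while a task is running in this slot
};

// Length of the common prefix of two strings.
static std::size_t common_prefix(const std::string & a, const std::string & b) {
    std::size_t n = 0;
    while (n < a.size() && n < b.size() && a[n] == b[n]) {
        ++n;
    }
    return n;
}

// Pick the idle slot whose cache shares the longest prefix with the prompt.
// Returns -1 if every slot is busy. (Assumed meaning of "best cache".)
static int pick_best_slot(const std::vector<Slot> & slots, const std::string & prompt) {
    int         best     = -1;
    std::size_t best_len = 0;
    for (std::size_t i = 0; i < slots.size(); ++i) {
        if (slots[i].busy) {
            continue;
        }
        const std::size_t len = common_prefix(slots[i].cache, prompt);
        if (best == -1 || len > best_len) {
            best     = static_cast<int>(i);
            best_len = len;
        }
    }
    return best;
}

// Launch the parent first so it claims the slot with the best cache, then
// launch up to n_cmpl - 1 children in the remaining idle slots.
// Returns the slot indices used: parent first, then children.
static std::vector<int> launch_completions(std::vector<Slot> & slots,
                                           const std::string & prompt,
                                           int                 n_cmpl) {
    std::vector<int> used;
    const int parent = pick_best_slot(slots, prompt);
    if (parent < 0) {
        return used; // no idle slot at all
    }
    slots[parent].busy  = true;
    slots[parent].cache = prompt;
    used.push_back(parent);

    for (int c = 1; c < n_cmpl; ++c) {
        int child = -1;
        for (std::size_t i = 0; i < slots.size(); ++i) {
            if (!slots[i].busy) {
                child = static_cast<int>(i);
                break;
            }
        }
        if (child < 0) {
            break; // simplification: stop when no idle slot is left
        }
        slots[child].busy  = true;
        slots[child].cache = prompt; // child starts from the parent's prompt state
        used.push_back(child);
    }
    // Only at this point, with all children launched, would the parent proceed.
    return used;
}

// When a child finishes, its cache is removed.
static void finish_child(Slot & s) {
    s.busy = false;
    s.cache.clear();
}

// The parent keeps its cache so a later request can reuse it (assumption).
static void finish_parent(Slot & s) {
    s.busy = false;
}
```

In this toy model, launching the parent first matters because a child launched earlier could occupy the slot that already holds the matching prompt, which would force the parent to reprocess the prompt elsewhere. Clearing a finished child's cache leaves only the parent's slot holding the shared prompt.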
References
#18663 - server: fix n_cmpl not skipping processing prompt
Author
ggerganov
Committer
ggerganov
Parents
f2d988db