llama.cpp
3c81c8de
- server : print graphs reused in slot timings (#23279)
Commit
44 days ago
server : print graphs reused in slot timings (#23279)

Add graphs reused counter to the per-slot timing output, printed via llama_perf_context().

Assisted-by: llama.cpp:local pi
Co-authored-by: ggerganov <ggerganov@users.noreply.github.com>
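
The change described above can be pictured with a minimal sketch. The diff itself is not shown here, so everything below is an assumption made for illustration: `perf_data_sketch` is a hypothetical stand-in for the data returned by llama_perf_context(), and `format_slot_timings` and the line layout are invented names, not the server's actual code or log format.

```cpp
#include <cstdio>
#include <string>

// Hypothetical stand-in for per-context perf data; NOT the real llama.cpp struct.
struct perf_data_sketch {
    double t_p_eval_ms; // time spent processing the prompt, in ms
    double t_eval_ms;   // time spent generating tokens, in ms
    int    n_p_eval;    // number of prompt tokens processed
    int    n_eval;      // number of tokens generated
    int    n_reused;    // number of times a compute graph was reused
};

// Hypothetical helper: build a one-line per-slot timing summary that
// includes the graphs-reused counter alongside the existing timings.
std::string format_slot_timings(int slot_id, const perf_data_sketch & d) {
    char buf[256];
    std::snprintf(buf, sizeof(buf),
        "slot %d | prompt eval = %.2f ms / %d tokens | eval = %.2f ms / %d tokens | graphs reused = %d",
        slot_id, d.t_p_eval_ms, d.n_p_eval, d.t_eval_ms, d.n_eval, d.n_reused);
    return std::string(buf);
}
```

In this sketch a caller would fill the struct once per finished request and pass the result to its logger. Returning a string rather than printing directly keeps the formatting easy to test in isolation.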
References
#23279 - server : print graphs reused in slot timings
Author
ggerganov
Parents
cd963fee