llama.cpp
3c81c8de - server : print graphs reused in slot timings (#23279)

Commit
44 days ago
server : print graphs reused in slot timings (#23279)

Add graphs reused counter to the per-slot timing output, printed via llama_perf_context().

Assisted-by: llama.cpp:local pi
Co-authored-by: ggerganov <ggerganov@users.noreply.github.com>
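
To illustrate what the commit message describes, here is a minimal sketch of a per-slot timing printout that includes a graphs-reused line. It is not the code from this commit: the `perf_snapshot` struct, the `format_slot_timings` function, and the log layout are all hypothetical stand-ins, and the real server reads its counters through llama_perf_context() rather than from a hand-filled struct.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical stand-in for the performance counters of one context.
// This is NOT the real llama.cpp struct; field names are assumptions.
struct perf_snapshot {
    double  t_p_eval_ms; // time spent on prompt processing, in ms
    double  t_eval_ms;   // time spent on token generation, in ms
    int32_t n_p_eval;    // number of prompt tokens processed
    int32_t n_eval;      // number of tokens generated
    int32_t n_reused;    // number of times a compute graph was reused
};

// Hypothetical formatter: builds the timing text for one slot,
// with the graphs-reused counter as the last line.
// The layout is an assumption, not the server's actual log format.
std::string format_slot_timings(const perf_snapshot & p) {
    char buf[256];
    std::snprintf(buf, sizeof(buf),
        "prompt eval time = %10.2f ms / %5d tokens\n"
        "       eval time = %10.2f ms / %5d tokens\n"
        "   graphs reused = %10d\n",
        p.t_p_eval_ms, p.n_p_eval,
        p.t_eval_ms,   p.n_eval,
        p.n_reused);
    return std::string(buf);
}
```

A caller would fill a `perf_snapshot` from whatever counters it has and write the returned string to its log. Keeping the formatting in a pure function that returns a string makes the output easy to check without running a server.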