llama.cpp
61bdfd52 - server : implement prompt processing progress report in stream mode (#15827)

Commit · 34 days ago
server : implement prompt processing progress report in stream mode (#15827)

* server : implement `return_progress`
* add timings.cache_n
* add progress.time_ms
* add test
* fix test for chat/completions
* readme: add docs on timings
* use ggml_time_us

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
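For context, the commit adds a prompt-processing progress report to the server's streamed responses. The sketch below shows one way a client might pull a progress percentage out of a streamed SSE chunk. The exact field names inside the progress object (`total`, `cache`, `processed`) are assumptions modeled on the identifiers named in the commit summary (`return_progress`, `timings.cache_n`, `progress.time_ms`), not a verified schema.

```python
import json

def prompt_progress_pct(sse_line: str):
    """Extract a prompt-processing progress percentage from one SSE 'data:' line.

    NOTE: the 'prompt_progress' object and its 'total'/'processed'/'cache'/
    'time_ms' keys are hypothetical, inferred from the commit message.
    Returns None if the line carries no usable progress information.
    """
    prefix = "data: "
    if not sse_line.startswith(prefix):
        return None
    chunk = json.loads(sse_line[len(prefix):])
    prog = chunk.get("prompt_progress")
    if not prog or not prog.get("total"):
        return None
    return 100.0 * prog["processed"] / prog["total"]

# Hypothetical chunk shaped after the fields named in the commit message.
line = 'data: {"prompt_progress": {"total": 1024, "cache": 256, "processed": 512, "time_ms": 87}}'
```

With the sample line above, `prompt_progress_pct(line)` yields 50.0; a real client would call it on each streamed line until generation tokens start arriving.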