llama.cpp
ef22b3e4 - docs: fix metrics endpoint description in server README (#22879)

Commit
10 days ago
docs: fix metrics endpoint description in server README (#22879)

* docs: fix metrics endpoint description in server README

Required model query parameter for router mode described.

Removed metrics:
- llamacpp:kv_cache_usage_ratio
- llamacpp:kv_cache_tokens

Added metrics:
- llamacpp:prompt_seconds_total
- llamacpp:tokens_predicted_seconds_total
- llamacpp:n_decode_total
- llamacpp:n_busy_slots_per_decode

* server: fix metrics type for n_busy_slots_per_decode metric
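As a rough illustration of what the documented change means for a client, the sketch below builds a metrics URL with the `model` query parameter and parses Prometheus-style text output for the metric names listed in the commit message. The `/metrics` path, the base URL, the model name, the helper names `build_metrics_url` and `parse_prometheus_text`, and all sample values are assumptions made for illustration; they are not taken from the commit itself.

```python
from urllib.parse import urlencode


def build_metrics_url(base_url, model=None):
    """Build a metrics URL; `model` is appended as a query parameter.

    The "/metrics" path is an assumption. Per the commit message, the
    `model` query parameter is required in router mode.
    """
    url = base_url.rstrip("/") + "/metrics"
    if model is not None:
        url += "?" + urlencode({"model": model})
    return url


def parse_prometheus_text(text):
    """Parse simple Prometheus exposition lines into {name: value}.

    Comment lines (# HELP, # TYPE) are skipped and labels are dropped.
    Lines carrying a trailing timestamp are not handled in this sketch.
    """
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name_part, _, value = line.rpartition(" ")
        name = name_part.split("{", 1)[0].strip()
        metrics[name] = float(value)
    return metrics


# Made-up sample output using the metric names added in this commit.
SAMPLE = """\
# TYPE llamacpp:prompt_seconds_total counter
llamacpp:prompt_seconds_total 1.5
llamacpp:tokens_predicted_seconds_total 4.25
llamacpp:n_decode_total 120
llamacpp:n_busy_slots_per_decode 1.5
"""

if __name__ == "__main__":
    # Hypothetical host and model name.
    print(build_metrics_url("http://localhost:8080", "my-model"))
    for name, value in parse_prometheus_text(SAMPLE).items():
        print(name, value)
```

A dashboard that still reads the removed `llamacpp:kv_cache_usage_ratio` or `llamacpp:kv_cache_tokens` keys would get a `KeyError` from the parsed dictionary, which is one simple way to notice stale queries.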