vllm
758df5af - [NIXL][Metrics] Track `nixl_num_kv_expired_reqs` metric in Prometheus (#32340)

Commit
26 days ago
[NIXL][Metrics] Track `nixl_num_kv_expired_reqs` metric in Prometheus (#32340)

Add a new metric to track the number of requests whose KV blocks expired. This scenario is particularly important to surface and track, as it is a vital indicator of the health of the deployment. Currently we resort to tracking these failures through unstructured log parsing (which is, among other things, dependent on the error string); current main:

> Releasing expired KV blocks for request cmpl-071d which were retrieved by 0 decode worker(s) within 0 seconds.

Signed-off-by: NickLucche <nlucches@redhat.com>
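To illustrate why a dedicated counter is more robust than log parsing, here is a minimal, stdlib-only sketch. It is not the code from this commit: the regex, `count_expired_from_logs`, and `ExpiredKVCounter` are hypothetical names made up for illustration, and the real change presumably registers the metric through vLLM's existing Prometheus integration rather than a hand-rolled class.

```python
import re
import threading

# Hypothetical pattern for the log line quoted in the commit message.
# Any rewording of that message silently breaks this approach.
EXPIRED_LOG_RE = re.compile(
    r"Releasing expired KV blocks for request (?P<req_id>\S+) which were "
    r"retrieved by (?P<workers>\d+) decode worker\(s\) within (?P<secs>\d+) seconds"
)


def count_expired_from_logs(lines):
    """Brittle approach: count expirations by matching the exact log wording."""
    return sum(1 for line in lines if EXPIRED_LOG_RE.search(line))


class ExpiredKVCounter:
    """Illustrative monotonic counter (an assumption, not vLLM's implementation)."""

    def __init__(self, name="nixl_num_kv_expired_reqs"):
        self.name = name
        self._value = 0
        self._lock = threading.Lock()

    def inc(self, amount=1):
        # Called at the point where expired KV blocks are released.
        with self._lock:
            self._value += amount

    @property
    def value(self):
        with self._lock:
            return self._value

    def render(self):
        # Prometheus text exposition format: a TYPE line, then the sample.
        return f"# TYPE {self.name} counter\n{self.name} {self.value}\n"
```

With a counter like this, the code path that releases expired blocks calls `inc()` directly, so the count no longer depends on how the log message is worded.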