huggingface/text-generation-inference
Pull Requests
Commits
Open
Closed
Expose the real-time internal state of the batcher through SSE
#3065 opened 2025-02-27 16:01 by mfuntowicz
Added model name label to metrics and added an optional argument --served-model-name
wontfix
#3064 opened 2025-02-27 10:50 by yashaswipiplani
display available cached versions in TGI server error message of Neuron backend
#3063 opened 2025-02-26 23:49 by jimburtoft
Support xccl distributed backend
#3034 opened 2025-02-18 17:43 by dvrogozh
Fix CPU and memory affinity under external resource management
#3012 opened 2025-02-11 10:34 by askervin
Kvrouter that will increase the kv-cache hits in case of multiple routing strategy
#2965 opened 2025-01-29 11:43 by Narsil
Update Dockerfile to use devel image for compatibility
#2848 opened 2024-12-16 13:00 by YaserJaradeh
Enable qwen2vl video
#2756 opened 2024-11-18 17:59 by drbh
[WIP] Add gfx1100 support to AMD pytorch build
#2642 opened 2024-10-13 06:11 by cazlo
Add model_load_time metric
#2311 opened 2024-07-26 00:48 by Edwinhr716