huggingface/text-generation-inference
Pull Requests
HuggingFaceM4/Idefics3-8B-Llama3 crash fix
#3267 opened 2025-06-16 01:54 by sywangyi
Disable mamba in CPU platform
#3266 opened 2025-06-13 17:38 by casassg
Migrate to V2 Pydantic interface
#3262 opened 2025-06-11 22:37 by emmanuel-ferdman
fix multi-modality apply chat template issue
#3258 opened 2025-06-06 13:10 by sywangyi
feat: improve llava next pooling for granite vision
#3255 opened 2025-06-04 13:54 by drbh
Xccl
#3252 opened 2025-06-02 08:40 by sywangyi
xpu lora support
#3232 opened 2025-05-19 02:51 by sywangyi
Trtllm backend improvements
#3231 opened 2025-05-17 19:43 by leejuyuu
Fix typos
#3210 opened 2025-05-06 08:42 by omahs
feat: lock updated kernel versions
#3201 opened 2025-04-29 15:06 by drbh
Set `uv` UV_PYTHON_INSTALL_DIR explicitly
#3197 opened 2025-04-27 17:15 by sebastianliebscher
README: minimum Python version is 3.10
#3194 opened 2025-04-25 14:21 by Frenzie
feat: support logit bias in chat request
#3186 opened 2025-04-22 16:20 by drbh
Fix flashinfer plan call to use positional arguments for #3165
#3166 opened 2025-04-11 14:16 by ruckc
Update to flashinfer 0.2.5
#3164 opened 2025-04-11 10:25 by danieldk
Add chunked attn for L4
#3162 opened 2025-04-10 15:00 by mht-sharma
Gaudi: add CI
#3160 opened 2025-04-10 09:06 by baptistecolle
Update links Inferentia refer docs
#3154 opened 2025-04-09 07:34 by guspan-tanadi
feat: align function id with tool call response
#3111 opened 2025-03-13 19:31 by drbh
wip: comment out prepend full_text
#3079 opened 2025-03-07 00:54 by jrc2139
Expose the real-time internal state of the batcher through SSE
#3065 opened 2025-02-27 16:01 by mfuntowicz
Added model name label to metrics and added an optional argument --served-model-name
Label: wontfix
#3064 opened 2025-02-27 10:50 by yashaswipiplani
display available cached versions in TGI server error message of Neuron backend
#3063 opened 2025-02-26 23:49 by jimburtoft
Support xccl distributed backend
#3034 opened 2025-02-18 17:43 by dvrogozh
Fix CPU and memory affinity under external resource management
#3012 opened 2025-02-11 10:34 by askervin
[Backend] Introduce vLLM backend
#2976 opened 2025-01-31 10:01 by mfuntowicz
Kvrouter that will increase the kv-cache hits in case of multiple routing strategy
#2965 opened 2025-01-29 11:43 by Narsil
misc(gha): expose action cache url and runtime as secrets
#2964 opened 2025-01-29 09:30 by mfuntowicz
llava next image encoder to allow un-aligned patch / image sizes
#2936 opened 2025-01-22 09:10 by jimexist
Update Dockerfile to use devel image for compatibility
#2848 opened 2024-12-16 13:00 by YaserJaradeh