vllm-project/vllm: Open Pull Requests

[ROCm][CI] Fix AMD Examples Test Group
Labels: documentation, rocm, ci/build
#30276 opened 2025-12-08 19:35 by Concurrensee

[NIXL] Refine decoder-side post-processing for heterogeneous BlockSize and kv_layout
Labels: v1, kv-connector
#30275 opened 2025-12-08 19:34 by xuechendi

[AMD] DeepSeek AITER fusions
Labels: rocm, needs-rebase, v1, deepseek
#30274 opened 2025-12-08 19:33 by k50112113

[Bugfix] Temporarily disable group quant RMS norm fusion
#30273 opened 2025-12-08 18:34 by ElizaWszola

[CI/Build] Use spawn subprocess for ROCm
Labels: documentation, rocm
#30272 opened 2025-12-08 18:16 by rjrock

[ROCm][CI][Bugfix] Multi-Modal Model Support Fixes and Attention Backend Improvements
Labels: rocm, ci/build, multi-modality, qwen
#30270 opened 2025-12-08 17:00 by AndreasKaratzas

[Bugfix] Fix DeepGEMM after #29546
Labels: ready
#30267 opened 2025-12-08 16:06 by zhewenl

[Frontend] Fix Anthropic streaming message_start usage nesting
Labels: frontend, ready
#30266 opened 2025-12-08 15:58 by bbartels

Fix 500 errors on /tokenize and 400 errors on /v1/chat/completions when using truncate_prompt_tokens under high concurrency
Labels: frontend
#30264 opened 2025-12-08 14:41 by Soufiane-Ra

Multiple Hybrid KV Cache Coordinator
Labels: v1
#30263 opened 2025-12-08 14:09 by roikoren755

Support TP sizes that do not divide evenly for NVFP4 kernels (flashinfer-cutlass) by adding dynamic padding
Labels: nvidia
#30260 opened 2025-12-08 13:25 by danielafrimi

[Feature]: OpenTelemetry Metrics Support
Labels: v1
#30258 opened 2025-12-08 11:45 by mladjan-gadzic

[Bugfix][Quantization] Fix FP8 per_tensor scale shape
Labels: rocm, ready, v1
#30257 opened 2025-12-08 11:37 by haoyangli-amd

[ROCm] Use aiter.topk_sigmoid in llama4
Labels: rocm, llama
#30255 opened 2025-12-08 11:07 by tpopp

GPTQ Marlin quantization support for fused MoE with LoRA
#30254 opened 2025-12-08 10:33 by Bhanu068

fix: DeepSeek-V3.2 DeepGEMM RuntimeError
Labels: deepseek
#30251 opened 2025-12-08 09:38 by KeeProMise

[gpt-oss] Add model_identity to system message retrieval for harmony chat template
Labels: frontend, gpt-oss
#30247 opened 2025-12-08 08:43 by lyuwen

[Bugfix] Fix fusion for VL models
#30244 opened 2025-12-08 07:47 by ElizaWszola

[Feature] Skip language model in Encoder
Labels: qwen
#30242 opened 2025-12-08 07:09 by Bounty-hunter

[Bugfix] Fix "Current vLLM config is not set." warnings when FlashInfer attention is used
Labels: v1, nvidia
#30241 opened 2025-12-08 06:30 by nvpohanh

[Bugfix] Fix streaming final output for non-Harmony
Labels: frontend, gpt-oss
#30237 opened 2025-12-08 05:58 by penfree

Bump actions/stale from 10.1.0 to 10.1.1
Labels: dependencies, ci/build, github_actions
#30234 opened 2025-12-08 04:39 by dependabot[bot]

Bump actions/checkout from 6.0.0 to 6.0.1
Labels: dependencies, ci/build, github_actions
#30233 opened 2025-12-08 04:39 by dependabot[bot]

[responsesAPI][6] Fix multi-turn MCP tokenization
Labels: documentation, frontend, gpt-oss
#30230 opened 2025-12-08 03:08 by qandrew

Fix scheduler yield on ARM
#30228 opened 2025-12-08 02:59 by wangxiyuan

[Misc] Pass kwargs to get attn_backend_cls
#30226 opened 2025-12-08 02:41 by Potabk

[Platform] Let EPD work with non-CUDA platforms
Labels: nvidia
#30225 opened 2025-12-08 02:22 by wangxiyuan

[Cleanup] Remove unused ModelRunner V1 `InputBatch.num_tokens` field
Labels: tpu, ready, v1
#30218 opened 2025-12-07 20:10 by njhill

[LMCache] Fix breakage due to new LMCache version
Labels: ready, ci/build, kv-connector
#30216 opened 2025-12-07 18:13 by njhill

[Feature] Auto-calculate num_redundant_experts for EPLB (#30075)
Labels: documentation
#30215 opened 2025-12-07 17:45 by parlakisik