vllm-project/vllm
Open Pull Requests
Upstream fp8 with static scales gpt oss
needs-rebase, gpt-oss
#30357 opened 2025-12-09 19:49 by maleksan85
[CI][DeepSeek] Add nightly DeepSeek R1 `lm_eval` tests on H200
ready, ci/build, deepseek
#30356 opened 2025-12-09 18:05 by MatthewBonanni
[Fix] Handle multiple tool calls in Qwen3-MTP tool parser
frontend, tool-calling, qwen
#30353 opened 2025-12-09 17:48 by ArkVex
[Bugfix] Cache added_vocab to avoid per-token overhead
#30351 opened 2025-12-09 16:57 by scratch-ml
Remove virtual engine handling
tpu, needs-rebase, v1, codex, qwen, kv-connector
#30350 opened 2025-12-09 16:34 by WoosukKwon
[BugFix] Fix minimax m2 model rope_parameters
#30349 opened 2025-12-09 15:42 by esmeetu
[Docs]: adds a new metric vllm:request_prefill_kv_computed_tokens in docs
documentation
#30348 opened 2025-12-09 15:24 by googs1025
[cpu][ci] Add CPU Attention Tests for Neon Backend
#30347 opened 2025-12-09 15:14 by fadara01
[Core] Major fix catch backend grammar exceptions (xgrammar, outlines, etc) in scheduler
v1
#30346 opened 2025-12-09 14:58 by blancsw
Fix typos in comments across multiple files
documentation, ready, v1
#30345 opened 2025-12-09 14:56 by wilsonwu
[Bugfix] Fix HunyuanOCR cross-image contamination in batch processing
#30344 opened 2025-12-09 14:49 by anker-c2
[CI] refine more logic when generating and using nightly wheels & indices
ci/build
#30341 opened 2025-12-09 14:17 by Harry-Chen
Add Eagle and Eagle3 support to Transformers modeling backend
#30340 opened 2025-12-09 14:09 by hmellor
[CMake][Build]: Remove unused ACL CMake env variables
ci/build
#30339 opened 2025-12-09 14:09 by Radu2k
Fix gigachat3 parser + update tests
frontend, tool-calling
#30338 opened 2025-12-09 13:37 by ajpqs
fix: enhance human_readable_int function
#30337 opened 2025-12-09 13:28 by andyxning
[Bugfix] Fix fp8 DeepGemm compilation issues
bug, ready, ci-failure, deepseek
#30336 opened 2025-12-09 12:41 by ElizaWszola
[Bugfix]: Streaming i/o of batch files. Resolves #30268
frontend, ci/build
#30334 opened 2025-12-09 11:32 by umgefahren
[Bugfix] tpu_model_runner: set vllm config context when calling reset_dynamo_cache()
tpu, ready, v1
#30331 opened 2025-12-09 11:08 by dtrifiro
[Bugfix] Fix cuda graph sizes when running with speculative decoding
nvidia
#30330 opened 2025-12-09 10:45 by PatrykSaffer
[Feature][CPU Backend]: Add PyTorch vectorized backend
#30329 opened 2025-12-09 10:18 by Radu2k
[BugFix] Fix hang issue in LMCache mp mode
v1, kv-connector
#30327 opened 2025-12-09 09:57 by wz1qqx
[Frontend] [Doc] Exclude log deltas feature
frontend
#30322 opened 2025-12-09 09:13 by Catacomba
[BugFix] Spec decode with VLLM_ENABLE_V1_MULTIPROCESSING=0
v1
#30319 opened 2025-12-09 08:26 by heheda12345
[Frontend] Allow users to modify the scheduler configuration online in dev mode.
frontend, v1
#30316 opened 2025-12-09 08:06 by noooop
Generalize pooling model support with multi-task, multi-layer, multi-label classification that can be pooled from both hidden states and LM head's logits.
#30315 opened 2025-12-09 07:56 by kflu
[fix] fix SM check for Flashinfer TRTLLM MOE
nvidia
#30314 opened 2025-12-09 07:12 by jiahanc
[Misc][Quantization] Clarify the intent of GGUF `FusedMoE` weight materialization
ready
#30310 opened 2025-12-09 06:40 by a4lg
[bugfix][quantization] fix quark qwen3 kv_cache quantization
ready, qwen
#30308 opened 2025-12-09 05:41 by haoyangli-amd
Fix incomplete response generation for tool call outputs
frontend, deepseek, fb-exported, meta-exported
#30304 opened 2025-12-09 05:01 by qandrew