vllm-project/vllm
Pull Requests
Open
[BugFix] Fix minimax m2 model rotary_dim
ready
#30384 opened 2025-12-10 10:37 by rogeryoungh
adding constraint updates of cos-sin to improve mrope performance
#30377 opened 2025-12-10 08:48 by wujinyuan1
[Fix]fix import error from lmcache
kv-connector
#30376 opened 2025-12-10 08:38 by wz1qqx
Implement LMDB-based multi-modal cache
ci/build
v1
multi-modality
#30373 opened 2025-12-10 07:21 by petersalas
[Fix] Add default rope theta for qwen1 model
qwen
#30369 opened 2025-12-10 02:36 by iwzbi
[Bug Fix] Fix Kimi-Linear model initialization crash due to missing 'indexer_rotary_emb' arg
#30366 opened 2025-12-10 00:02 by yonasTMC
fix(gguf): Auto-select compatible dtype for GGUF models on Blackwell
#30365 opened 2025-12-09 23:59 by kitaekatt
[Bugfix] awq_gemm: fix argument order swap
#30364 opened 2025-12-09 23:17 by mgehre-amd
Remove all2all backend envvar
documentation
ci/build
#30363 opened 2025-12-09 23:09 by elizabetht
[WIP] Bump dockerfile to cuda 13.0.2 (for testing)
ci/build
nvidia
#30362 opened 2025-12-09 22:51 by dougbtv
[Attention][AMD] Make flash-attn optional
rocm
speculative-decoding
v1
#30361 opened 2025-12-09 22:46 by mgehre-amd
Upstream fp8 with static scales gpt oss
needs-rebase
gpt-oss
#30357 opened 2025-12-09 19:49 by maleksan85
[CI][DeepSeek] Add nightly DeepSeek R1 `lm_eval` tests on H200
ready
ci/build
deepseek
#30356 opened 2025-12-09 18:05 by MatthewBonanni
[Fix] Handle multiple tool calls in Qwen3-MTP tool parser
frontend
tool-calling
qwen
#30353 opened 2025-12-09 17:48 by ArkVex
Remove virtual engine handling
tpu
needs-rebase
v1
codex
qwen
kv-connector
#30350 opened 2025-12-09 16:34 by WoosukKwon
[Docs]: adds a new metric vllm:request_prefill_kv_computed_tokens in docs
documentation
#30348 opened 2025-12-09 15:24 by googs1025
[Core] Major fix catch backend grammar exceptions (xgrammar, outlines, etc) in scheduler
v1
#30346 opened 2025-12-09 14:58 by blancsw
[Bugfix] Fix HunyuanOCR cross-image contamination in batch processing
#30344 opened 2025-12-09 14:49 by anker-c2
[CI] refine more logic when generating and using nightly wheels & indices
ci/build
#30341 opened 2025-12-09 14:17 by Harry-Chen
Add Eagle and Eagle3 support to Transformers modeling backend
#30340 opened 2025-12-09 14:09 by hmellor
Fix gigachat3 parser + update tests
frontend
tool-calling
#30338 opened 2025-12-09 13:37 by ajpqs
fix: enhance human_readable_int function
#30337 opened 2025-12-09 13:28 by andyxning
[Bugfix]: Streaming i/o of batch files. Resolves #30268
frontend
ci/build
#30334 opened 2025-12-09 11:32 by umgefahren
[Bugfix] tpu_model_runner: set vllm config context when calling reset_dynamo_cache()
tpu
ready
v1
#30331 opened 2025-12-09 11:08 by dtrifiro
[Feature][CPU Backend]: Add PyTorch vectorized backend
#30329 opened 2025-12-09 10:18 by Radu2k
[Frontend] [Doc] Exclude log deltas feature
frontend
#30322 opened 2025-12-09 09:13 by Catacomba
[BugFix] Spec decode with VLLM_ENABLE_V1_MULTIPROCESSING=0
v1
#30319 opened 2025-12-09 08:26 by heheda12345
[Frontend] Allow users to modify the scheduler configuration online in dev mode.
frontend
v1
#30316 opened 2025-12-09 08:06 by noooop
Generalize pooling model support with multi-task, multi-layer, multi-label classification that can be pooled from both hidden states and LM head's logits.
#30315 opened 2025-12-09 07:56 by kflu
[fix] fix SM check for Flashinfer TRTLLM MOE
nvidia
#30314 opened 2025-12-09 07:12 by jiahanc