vllm-project/vllm: open pull requests

[Misc] Clean up unused custom ops copy_blocks and copy_blocks_mla
#30967 opened 2025-12-18 14:02 by lengrongfu

Migrate some old models to Transformers modelling backend
Labels: documentation, new-model
#30966 opened 2025-12-18 13:51 by hmellor

[Perf][ROCm][AWQ] Improve performance of fused MoE GPTQ-AWQ and AWQ dequant kernels
Labels: rocm, needs-rebase
#30965 opened 2025-12-18 13:05 by yuttian1

[Quantization] support logical_widths for fp8 marlin
#30962 opened 2025-12-18 12:36 by jinzhen-lin

[Quantization] fix marlin w8a8 check
#30961 opened 2025-12-18 12:21 by jinzhen-lin

Migrate from `mypy` to `ty`
Labels: ci/build
#30960 opened 2025-12-18 12:04 by hmellor

[Feature]: Support NVIDIA ModelOpt HF FP8 variants FP8_PER_CHANNEL_PER_TOKEN and FP8_PB_WO in vLLM
Labels: documentation, frontend, nvidia
#30957 opened 2025-12-18 09:46 by CedricHwong

Hotfix container build
Labels: documentation, ci/build, cpu
#30953 opened 2025-12-18 09:03 by maryamtahhan

[Misc] Improve worker error messages for better debugging
Labels: v1, cpu
#30951 opened 2025-12-18 08:42 by yurekami

[Metrics] Add Prometheus counters for Model FLOPs Utilization (MFU)
Labels: documentation, ready, v1
#30950 opened 2025-12-18 08:41 by markmc
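
Since MFU is a derived ratio (achieved FLOPs over peak hardware FLOPs for the elapsed time), counters typically expose the raw totals and leave the division to the query layer. A minimal sketch with prometheus_client, using hypothetical metric names rather than whatever this PR actually adds:

```python
# Sketch only: hypothetical metric names, not vLLM's actual metrics.
from prometheus_client import Counter, start_http_server

# Cumulative achieved FLOPs per model; MFU is derived at query time
# as rate(model_flops_total) / peak_device_flops.
flops_total = Counter(
    "model_flops_total", "Cumulative achieved model FLOPs", ["model"]
)

def record_step(model: str, step_flops: float) -> None:
    # Called once per forward pass with the FLOPs it performed.
    flops_total.labels(model=model).inc(step_flops)

if __name__ == "__main__":
    start_http_server(8000)  # exposes a scrape endpoint at :8000/metrics
    record_step("demo-model", 3.2e12)
```

A monotonic counter is the right primitive here precisely because the utilization ratio is computed from rates at query time rather than stored directly.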

[Doc] Add Sophgo TPU Support
Labels: documentation
#30949 opened 2025-12-18 08:30 by wzyrrr

fix: Suppress torch.frombuffer UserWarning for non-writable buffers
#30948 opened 2025-12-18 08:25 by yurekami
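
For context on the warning being suppressed: torch.frombuffer warns whenever it is handed a read-only buffer such as bytes, because the resulting tensor aliases memory it must not write to. A minimal sketch of the targeted-filter approach, independent of how this PR actually implements it:

```python
import warnings
import torch

payload = b"\x00\x01\x02\x03"  # bytes objects are read-only buffers

# Scope the filter tightly so other UserWarnings still surface.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", category=UserWarning)
    tensor = torch.frombuffer(payload, dtype=torch.uint8)

print(tensor)  # tensor([0, 1, 2, 3], dtype=torch.uint8)
```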

[RFC][docs] Add lightweight AI-assisted contribution policy
Labels: documentation
#30947 opened 2025-12-18 08:22 by markmc

[Core] Improve DP synchronization error messages
Labels: v1
#30946 opened 2025-12-18 08:10 by yurekami

[XPU] enable fp8 online streaming quantization
#30944 opened 2025-12-18 07:44 by yma11

[Feature] Add --ssl-ciphers CLI argument for TLS cipher control
Labels: frontend
#30937 opened 2025-12-18 07:03 by rickychen-infinirc
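
Under the hood, a flag like this usually just forwards a cipher string to the server's SSLContext. A minimal sketch of the mechanism with the standard ssl module; the cipher string below is illustrative, not something taken from the PR:

```python
import ssl

# Server-side context; a real server would also call load_cert_chain().
ctx = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)

# Restrict the OpenSSL cipher list; raises ssl.SSLError if no
# cipher can be selected from the given string.
ctx.set_ciphers("ECDHE+AESGCM:!aNULL:!MD5")
```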

[v1][CP] Improve DCP/PCP/MTP error messages with actionable guidance
Labels: v1
#30936 opened 2025-12-18 06:59 by yurekami

[XPU] allow custom workers (e.g. vllm-omni workers) to be used on XPU
#30935 opened 2025-12-18 06:56 by faaany

Enable vLLM unit tests for Intel GPU
Labels: documentation, rocm, speculative-decoding, v1, nvidia
#30932 opened 2025-12-18 06:06 by wincent8

[Multimodal] Add FIPS 140-3 compliant hashing support
Labels: multi-modality
#30925 opened 2025-12-18 05:07 by yurekami
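
FIPS 140-3 compliance for content hashing generally means fingerprinting with an approved algorithm such as SHA-256, and explicitly marking any residual non-approved use as non-cryptographic. A minimal, generic sketch with hashlib; none of this reflects vLLM's actual multimodal hashing API:

```python
import hashlib

image_bytes = b"...multimodal payload..."

# Approved algorithm for content fingerprints.
fingerprint = hashlib.sha256(image_bytes).hexdigest()

# usedforsecurity=False (Python 3.9+) declares an algorithm as
# non-cryptographic, which FIPS-mode OpenSSL builds can require
# before they will allow e.g. MD5 at all.
legacy = hashlib.new("md5", image_bytes, usedforsecurity=False).hexdigest()
```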

[BugFix] Fix TypeError: unhashable type: 'dict' when serving deepseek32
Labels: ready, v1, deepseek
#30924 opened 2025-12-18 04:16 by LucasWilkinson
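
The error class named in this title is easy to reproduce generically: dicts are mutable and therefore unhashable, so using one as a dictionary key or set member raises this TypeError, and the usual fix is an immutable equivalent. A minimal illustration, unrelated to the PR's actual code path:

```python
config = {"quant": "fp8", "tp": 2}

cache = {}
try:
    cache[config] = "compiled"       # TypeError: unhashable type: 'dict'
except TypeError as exc:
    print(exc)

# Convert to an immutable, hashable representation first.
key = tuple(sorted(config.items()))
cache[key] = "compiled"
```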

[Feature] Support using FlexKV as another KV cache offloading option.
Labels: documentation, needs-rebase, v1, kv-connector
#30917 opened 2025-12-18 02:35 by axxx03

[BugFix] Fix spec decode + structured outputs + preemption edge case
Labels: bug, ready, v1
#30916 opened 2025-12-18 01:58 by njhill

[Bug] Fix torch inductor issue
Labels: ready, qwen
#30914 opened 2025-12-18 00:35 by yewentao256

[docker] install cuda13 version of lmcache and nixl
Labels: ci/build, kv-connector, nvidia
#30913 opened 2025-12-17 23:47 by soodoshll

Fix/get raw stream patch #30905
#30912 opened 2025-12-17 23:39 by baonudesifeizhai

Migrate activation kernels to libtorch stable ABI
Labels: ci/build, cpu, nvidia
#30908 opened 2025-12-17 22:49 by mikaylagawarecki

[Bug] Fix batch invariant in torch 2.10
Labels: ready
#30907 opened 2025-12-17 22:20 by yewentao256

[Feature]: Support serving nvfp4 W4A16 MoE models using Marlin
#30906 opened 2025-12-17 21:59 by EdalatiAli

Custom build backend
Labels: rocm, needs-rebase, ci/build
#30901 opened 2025-12-17 19:47 by dtrifiro