vllm-project/vllm
Commits
lease-refresh
revert spurious change
Robert Shaw
committed
2 days ago
275da3ca
refactor a bit
Robert Shaw
committed
2 days ago
a250ae33
update from nixl to internal
Robert Shaw
committed
2 days ago
4b554d19
humans are still needed to write code
Robert Shaw
committed
2 days ago
934224a2
[Feat][NIXL] Add KV lease refresh mechanism for disaggregated prefill
Robert Shaw
committed
2 days ago
38c00afb
[Docs] Add breadcrumbs for better UX (#35749)
hmellor
committed
2 days ago
Verified
7e9149d9
[MyPy][BugFix] Check profiler is assigned before calling start() on it (#35505)
hickeyma
committed
3 days ago
Verified
87c98b02
Fix unresolved-import errors when using Astral's ty by removing src.root (#35681)
tlrmchlsmth
committed
3 days ago
Verified
de7dd634
[Feat] Supports Anthropic Messages count_tokens API (#35588)
chaunceyjiang
committed
3 days ago
Verified
9a87b057
[Misc] Cleanup useless `current_platform` import (#35715)
wangxiyuan
committed
3 days ago
Verified
510bc9e1
[CPU][Distributed] Fix Enable _CPUSHMDistributed only when TP/PP ranks share the same SHM group name (#34169)
charlesashby
committed
3 days ago
Verified
cbd361fd
[Misc] Bound NIXL upper bound version (#35495)
NickLucche
committed
3 days ago
Verified
c212202d
[CI] Defining extended V1 e2e + engine tests (#35580)
AndreasKaratzas
committed
3 days ago
Verified
ec27b36b
[Rocm][CI] Fix LM Eval Large Models (H100) test group (#34750)
charlifu
committed
3 days ago
Verified
3fd1d4ec
[Kernel] Integrate SM100 MXFP8 blockscaled grouped MM and quant kernels (#34448)
EdalatiAli
committed
3 days ago
Verified
cb21972a
[ROCm][CI] Disable skinny GEMMs in language model standard tests to fix non-determinism (#35152)
AndreasKaratzas
committed
3 days ago
Verified
c34963f1
[ROCm] add amd-quark package in requirements for rocm to use quantized models (#35658)
hongxiayang
committed
3 days ago
Verified
f26650d6
[XPU] fix mxfp4 activation type (#35691)
jikunshang
committed
3 days ago
Verified
92f5d0f0
Fix deprecated v1 config tests (#35327)
jcaip
committed
3 days ago
Verified
a60985b0
[Attention] FA4 integration (#32974)
LucasWilkinson
committed
3 days ago
Verified
8b5014d3
Revert "[Bugfix] Disable TRTLLM attention with KV transfer enabled (#33192)" (#34832)
ZhanqiuHu
committed
3 days ago
Verified
57a96e26
[torch.compile] Undo the fast_moe_cold_start hack in torch>=2.11 (#35475)
zou3519
committed
3 days ago
Verified
e82fbeec
[Bugfix] Fix dtype mismatch in RMSNormGated.forward_native() during torch.compile (#35256)
haosdent
committed
3 days ago
Verified
62904708
[Model Runner V2] Use block table apis for capture inputs (#35671)
WoosukKwon
committed
3 days ago
Verified
72f4d162
fix(mxfp4): return is_monolithic=False when LoRA is enabled for Triton backend (#35382)
yoonsnowdev
committed
3 days ago
Verified
5a435507
[MISC] Fixing a null reference by removing parallel_utils from mypy EXCLUDE (#35630)
taneem-ibrahim
committed
3 days ago
Verified
59d7af9c
[Mamba1] - Kernel Level Chunk Alignment for Prefix Caching (#34798)
Josephasafg
committed
4 days ago
Verified
bbf81f9a
[Model Runner V2] Minor refactoring for EncoderRunner (#35628)
WoosukKwon
committed
4 days ago
Verified
da543d1a
[AMD][CI] Support Triton attention with ExampleConnector (#34931)
rjrock
committed
4 days ago
Verified
87d319c5
Fix typo: implictly -> implicitly in isaac.py docstring (#35646)
lin-shh
committed
4 days ago
Verified
a9ec392c