vllm-project/vllm
Commits
test-debug-lb
updated
Robert Shaw
committed
339 days ago
d0d68a4c
relax hybrid dp asserts
tlrmchlsmth
committed
339 days ago
35f3782d
Merge remote-tracking branch 'origin/main' into one-pod-per-node-lb
njhill
committed
340 days ago
5fb68091
[Tests] Add tests for headless internal DP LB (#21450)
njhill
committed
340 days ago
Verified
316b1bf7
rename test
njhill
committed
340 days ago
fb0cf7e2
fix internal_dp_lb tests
njhill
committed
340 days ago
1c300fcf
[Bugfix][Qwen][DCA] fixes bug in dual-chunk-flash-attn backend for qwen 1m models. (#21364)
sighingnow
committed
340 days ago
Verified
7c734ee0
[V1] Check all pooling tasks during profiling (#21299)
DarkLight1337
committed
340 days ago
Verified
f59ec35b
CI tests for hybrid DPLB mode
njhill
committed
340 days ago
6328c808
[Tests] Add tests for headless internal DP LB
njhill
committed
340 days ago
d95aedd5
fix bad merge
njhill
committed
340 days ago
8601a22d
[Model] add Hunyuan V1 Dense Model support. (#21368)
Asher
committed
340 days ago
Verified
2671334d
[Docs] Clean up v1/metrics.md (#21449)
windsonsea
committed
340 days ago
Verified
2cc5016a
Merge remote-tracking branch 'refs/remotes/origin/main' into one-pod-per-node-lb
njhill
committed
340 days ago
1bd5f2f1
[Misc] fixed nvfp4_moe test failures due to invalid kwargs (#21246)
Yang Chen
committed
340 days ago
Verified
6929f8b4
Mamba V2 Test not Asserting Failures. (#21379)
fabianlim
committed
340 days ago
Verified
32ec9e2f
[Sampler] Introduce logprobs mode for logging (#21398)
houseroad
committed
340 days ago
Verified
accac829
[Docs] Fix bullets and grammars in tool_calling.md (#21440)
windsonsea
committed
340 days ago
Verified
23637dcd
Fixed typo in profiling logs (#21441)
sergiopaniego
committed
340 days ago
Verified
6364af92
[Bugfix] ensure tool_choice is popped when `tool_choice:null` is passed in json payload (#19679)
gcalmettes
committed
340 days ago
Verified
7aaa2bd5
fix handshake mock test
njhill
committed
340 days ago
f63cc192
add clear messages for deprecated models (#21424)
youkaichao
committed
340 days ago
Verified
2f5c14de
[Cleanup] Only log MoE DP setup warning if DP is enabled (#21315)
mgoin
committed
340 days ago
Verified
f002e9a8
[Core] Add basic unit test for maybe_evict_cached_block (#21400)
Jialin
committed
340 days ago
Verified
a1f3610f
[Bugfix] Fix nightly transformers CI failure (#21427)
Isotr0py
committed
340 days ago
Verified
4ecedd18
Changing "amdproduction" allocation. (#21409)
Alexei-V-Ivanov-AMD
committed
340 days ago
Verified
107111a8
[Bugfix][CUDA] fixes CUDA FP8 kv cache dtype supported (#21420)
elvischenv
committed
340 days ago
Verified
2dec7c1a
[BUGFIX] deepseek-v2-lite failed due to fused_qkv_a_proj name update (#21414)
xuechendi
committed
340 days ago
Verified
08d2bd78
[BugFix] Update python to python3 calls for image; fix prefix & input calculations. (#21391)
ericehanley
committed
340 days ago
Verified
4f76a05f
Simplify weight loading in Transformers backend (#21382)
hmellor
committed
340 days ago
Verified
f154bb9f