vllm-project/vllm
Commits
deepep_tweaks
add debug cruft
tlrmchlsmth
committed
344 days ago
fcec8c88
update
tlrmchlsmth
committed
344 days ago
850dafea
tolerances
tlrmchlsmth
committed
344 days ago
b4f17e12
fixup
tlrmchlsmth
committed
344 days ago
21ffc735
tweaks
tlrmchlsmth
committed
344 days ago
39d5d33f
precommit
tlrmchlsmth
committed
344 days ago
7a821f0e
fixes
tlrmchlsmth
committed
344 days ago
26fd8ca3
Unit test
tlrmchlsmth
committed
344 days ago
d5f20676
fixes - use-fp8-dispatch
Varun Sundar Rabindranath
committed
346 days ago
2b5ad9f2
DeepGEMM LL optimizations
tlrmchlsmth
committed
346 days ago
299f8291
Merge remote-tracking branch 'nm/varun/deepep-fp8-dispatch' into ll_deepgemm_opt
tlrmchlsmth
committed
346 days ago
104a984e
[Bugfix] fix RAY_CGRAPH_get_timeout is not set successfully (#19725)
chaunceyjiang
committed
346 days ago
Verified
12575cfa
[Hardware][AMD] integrate aiter chunked prefill into vllm (#18596)
Zzz9990
committed
346 days ago
Verified
8b6e1d63
deep_ep + use_fp8_dispatch
Varun Sundar Rabindranath
committed
346 days ago
8de2fd39
[Qwen] Add tagging rule for Qwen related PRs (#19799)
houseroad
committed
346 days ago
Verified
735a9de7
[Platform] Allow platform use V1 Engine by default (#19792)
wangxiyuan
committed
346 days ago
Verified
257ab954
[doc] fix the incorrect label (#19787)
reidliu41
committed
347 days ago
Verified
cca91a7a
[Minor] Zero-initialize attn output buffer (#19784)
WoosukKwon
committed
347 days ago
Verified
f04d6045
[V1] Decouple GPU and TPU `InputBatch` (#19778)
afeldman-nm
committed
347 days ago
Verified
19a53b27
[V1][P/D] An native implementation of xPyD based on P2P NCCL (#18242)
Abatom
committed
347 days ago
Verified
eccdc831
[V1] Add API docs for EncoderCacheManager (#19294)
russellb
committed
347 days ago
Verified
5f52a846
[Misc] Add __str__ for RequestStatus (#19780)
lk-chen
committed
347 days ago
Verified
d4629dc4
[MISC] correct DeviceConfig device field static type analysis (#19699)
andyxning
committed
347 days ago
Verified
6e9cc73f
[MISC] correct copy_blocks src_to_dists param type (#19696)
andyxning
committed
347 days ago
Verified
c53711bd
[TPU] Update torch version to include paged attention kernel change (#19706)
Chenyaaang
committed
347 days ago
Verified
dac8cc49
[Feature][ROCm] Add full graph capture support for TritonAttentionBackend (#19158)
charlifu
committed
347 days ago
Verified
a44b1c95
[Bugfix] Fix faulty triton importing logic when using Ray for DP (#19734)
mgoin
committed
347 days ago
Verified
b447624e
[Misc] Update lmcache connector with the latest connector apis (#19441)
YaoJiayi
committed
347 days ago
Verified
cda92307
Remove sm120 arch from sm100 cutlass kernel arch list (#19716)
mgoin
committed
347 days ago
Verified
bf57ccc5
[Perf] Optimize `moe_align_block_size` CUDA kernel (#19572)
yewentao256
committed
347 days ago
Verified
ffb2cd6b