vllm-project/vllm
Commits
deepep_tweaks
add debug cruft
tlrmchlsmth
committed
196 days ago
fcec8c88
update
tlrmchlsmth
committed
196 days ago
850dafea
tolerances
tlrmchlsmth
committed
196 days ago
b4f17e12
fixup
tlrmchlsmth
committed
196 days ago
21ffc735
tweaks
tlrmchlsmth
committed
196 days ago
39d5d33f
precommit
tlrmchlsmth
committed
196 days ago
7a821f0e
fixes
tlrmchlsmth
committed
196 days ago
26fd8ca3
Unit test
tlrmchlsmth
committed
196 days ago
d5f20676
fixes - use-fp8-dispatch
Varun Sundar Rabindranath
committed
198 days ago
2b5ad9f2
DeepGEMM LL optimizations
tlrmchlsmth
committed
198 days ago
299f8291
Merge remote-tracking branch 'nm/varun/deepep-fp8-dispatch' into ll_deepgemm_opt
tlrmchlsmth
committed
198 days ago
104a984e
[Bugfix] fix RAY_CGRAPH_get_timeout is not set successfully (#19725)
chaunceyjiang
committed
198 days ago
Verified
12575cfa
[Hardware][AMD] integrate aiter chunked prefill into vllm (#18596)
Zzz9990
committed
198 days ago
Verified
8b6e1d63
deep_ep + use_fp8_dispatch
Varun Sundar Rabindranath
committed
198 days ago
8de2fd39
[Qwen] Add tagging rule for Qwen related PRs (#19799)
houseroad
committed
198 days ago
Verified
735a9de7
[Platform] Allow platform use V1 Engine by default (#19792)
wangxiyuan
committed
198 days ago
Verified
257ab954
[doc] fix the incorrect label (#19787)
reidliu41
committed
198 days ago
Verified
cca91a7a
[Minor] Zero-initialize attn output buffer (#19784)
WoosukKwon
committed
199 days ago
Verified
f04d6045
[V1] Decouple GPU and TPU `InputBatch` (#19778)
afeldman-nm
committed
199 days ago
Verified
19a53b27
[V1][P/D] An native implementation of xPyD based on P2P NCCL (#18242)
Abatom
committed
199 days ago
Verified
eccdc831
[V1] Add API docs for EncoderCacheManager (#19294)
russellb
committed
199 days ago
Verified
5f52a846
[Misc] Add __str__ for RequestStatus (#19780)
lk-chen
committed
199 days ago
Verified
d4629dc4
[MISC] correct DeviceConfig device field static type analysis (#19699)
andyxning
committed
199 days ago
Verified
6e9cc73f
[MISC] correct copy_blocks src_to_dists param type (#19696)
andyxning
committed
199 days ago
Verified
c53711bd
[TPU] Update torch version to include paged attention kernel change (#19706)
Chenyaaang
committed
199 days ago
Verified
dac8cc49
[Feature][ROCm] Add full graph capture support for TritonAttentionBackend (#19158)
charlifu
committed
199 days ago
Verified
a44b1c95
[Bugfix] Fix faulty triton importing logic when using Ray for DP (#19734)
mgoin
committed
199 days ago
Verified
b447624e
[Misc] Update lmcache connector with the latest connector apis (#19441)
YaoJiayi
committed
199 days ago
Verified
cda92307
Remove sm120 arch from sm100 cutlass kernel arch list (#19716)
mgoin
committed
199 days ago
Verified
bf57ccc5
[Perf] Optimize `moe_align_block_size` CUDA kernel (#19572)
yewentao256
committed
199 days ago
Verified
ffb2cd6b