vllm-project/vllm
Commits
deepep_tweaks
fcec8c88  add debug cruft (tlrmchlsmth, committed 222 days ago)
850dafea  update (tlrmchlsmth, committed 222 days ago)
b4f17e12  tolerances (tlrmchlsmth, committed 222 days ago)
21ffc735  fixup (tlrmchlsmth, committed 222 days ago)
39d5d33f  tweaks (tlrmchlsmth, committed 222 days ago)
7a821f0e  precommit (tlrmchlsmth, committed 222 days ago)
26fd8ca3  fixes (tlrmchlsmth, committed 222 days ago)
d5f20676  Unit test (tlrmchlsmth, committed 222 days ago)
2b5ad9f2  fixes - use-fp8-dispatch (Varun Sundar Rabindranath, committed 224 days ago)
299f8291  DeepGEMM LL optimizations (tlrmchlsmth, committed 224 days ago)
104a984e  Merge remote-tracking branch 'nm/varun/deepep-fp8-dispatch' into ll_deepgemm_opt (tlrmchlsmth, committed 224 days ago)
12575cfa  [Bugfix] fix RAY_CGRAPH_get_timeout is not set successfully (#19725) (chaunceyjiang, committed 224 days ago, Verified)
8b6e1d63  [Hardware][AMD] integrate aiter chunked prefill into vllm (#18596) (Zzz9990, committed 224 days ago, Verified)
8de2fd39  deep_ep + use_fp8_dispatch (Varun Sundar Rabindranath, committed 224 days ago)
735a9de7  [Qwen] Add tagging rule for Qwen related PRs (#19799) (houseroad, committed 224 days ago, Verified)
257ab954  [Platform] Allow platform use V1 Engine by default (#19792) (wangxiyuan, committed 224 days ago, Verified)
cca91a7a  [doc] fix the incorrect label (#19787) (reidliu41, committed 224 days ago, Verified)
f04d6045  [Minor] Zero-initialize attn output buffer (#19784) (WoosukKwon, committed 224 days ago, Verified)
19a53b27  [V1] Decouple GPU and TPU `InputBatch` (#19778) (afeldman-nm, committed 224 days ago, Verified)
eccdc831  [V1][P/D] An native implementation of xPyD based on P2P NCCL (#18242) (Abatom, committed 224 days ago, Verified)
5f52a846  [V1] Add API docs for EncoderCacheManager (#19294) (russellb, committed 224 days ago, Verified)
d4629dc4  [Misc] Add __str__ for RequestStatus (#19780) (lk-chen, committed 224 days ago, Verified)
6e9cc73f  [MISC] correct DeviceConfig device field static type analysis (#19699) (andyxning, committed 224 days ago, Verified)
c53711bd  [MISC] correct copy_blocks src_to_dists param type (#19696) (andyxning, committed 224 days ago, Verified)
dac8cc49  [TPU] Update torch version to include paged attention kernel change (#19706) (Chenyaaang, committed 225 days ago, Verified)
a44b1c95  [Feature][ROCm] Add full graph capture support for TritonAttentionBackend (#19158) (charlifu, committed 225 days ago, Verified)
b447624e  [Bugfix] Fix faulty triton importing logic when using Ray for DP (#19734) (mgoin, committed 225 days ago, Verified)
cda92307  [Misc] Update lmcache connector with the latest connector apis (#19441) (YaoJiayi, committed 225 days ago, Verified)
bf57ccc5  Remove sm120 arch from sm100 cutlass kernel arch list (#19716) (mgoin, committed 225 days ago, Verified)
ffb2cd6b  [Perf] Optimize `moe_align_block_size` CUDA kernel (#19572) (yewentao256, committed 225 days ago, Verified)