vllm-project/vllm
Commits
revert-21550-chengji/fix-ci
Revert "[TPU][Bugfix] fix OOM issue in CI test (#21550)"
yaochengji
committed
137 days ago
Verified
c0a8db46
[TPU][Bugfix] fix OOM issue in CI test (#21550)
yaochengji
committed
137 days ago
Verified
40d86ee4
[Misc] Removed undefined cmake variables MOE_PERMUTE_ARCHS (#21262)
Yang Chen
committed
137 days ago
Verified
85d051f0
[CI/Build] fix cpu_extension for apple silicon (#21195)
ignaciosica
committed
137 days ago
Verified
5140f54b
[Misc][Tools] make max-model-len a parameter in auto_tune script (#21321)
yaochengji
committed
137 days ago
Verified
947edd09
[Model] Fix a check for None but the return value was empty list in Gemma3 MM vision_embeddings (#21479)
hfan
committed
137 days ago
Verified
fde60ee7
[Model] Support tensor parallel for timm ViT in Deepseek_vl2 (#21494)
wzqd
committed
137 days ago
Verified
b38bc652
[Bugfix] fix modelscope snapshot_download serialization (#21536)
andyxning
committed
137 days ago
Verified
adaf2c6d
[CI] Update CODEOWNERS for CPU and Intel GPU (#21582)
bigPYJ1151
committed
137 days ago
Verified
42343f1f
Integrate TensorSchema with shape validation for Phi3VImagePixelInputs (#21232)
bbeckca
committed
137 days ago
Verified
965bc71b
[Docs] Add `requirements/common.txt` to run unit tests (#21572)
zhouwfang
committed
137 days ago
Verified
807a328b
[TPU][Test] Temporarily suspend this MoE model in test_basic.py. (#21560)
QiliangCui
committed
137 days ago
Verified
e0be2c4d
[DP] Support api-server-count > 0 in hybrid DP LB mode (#21510)
njhill
committed
137 days ago
Verified
9c8b2c2a
[Bugfix] DeepGemm utils : Fix hardcoded type-cast (#21517)
varun-sundar-rabindranath
committed
137 days ago
Verified
2212cd6c
[Kernel] adding fused_moe configs for upcoming granite4 (#21332)
bringlein
committed
137 days ago
Verified
ce3a9b13
Fix GLM-4 PP Missing Layer When using with PP. (#21531)
zRzRzRzRzRzRzR
committed
137 days ago
Verified
2ce90e5b
[Bug] Fix DeepGemm Init Error (#21554)
yewentao256
committed
137 days ago
Verified
633f6e80
[Docs] Fix `site_url` for RunLLM (#21564)
hmellor
committed
137 days ago
Verified
b57296bb
[Frontend] `run-batch` supports V1 (#21541)
DarkLight1337
committed
137 days ago
Verified
34ddcf9f
[MoE] More balanced expert sharding (#21497)
WoosukKwon
committed
137 days ago
Verified
fe56180c
[TPU][TEST] HF_HUB_DISABLE_XET=1 the test 3. (#21539)
QiliangCui
committed
137 days ago
Verified
07d80d7b
update flashinfer to v0.2.9rc1 (#21485)
weireweire
committed
138 days ago
Verified
2dd72d23
[Docs] Add Expert Parallelism Initial Documentation (#21373)
simon-mo
committed
138 days ago
Verified
a6c7fb8c
[Docs][minor] Fix broken gh-file link in distributed serving docs (#21543)
crypdick
committed
138 days ago
Verified
a7272c23
[P/D] Support CPU Transfer in NixlConnector (#18293)
juncgu
committed
138 days ago
Verified
60662849
[P/D] Move FakeNixlWrapper to test dir (#21328)
ruisearch42
committed
138 days ago
Verified
1e9ea8e6
[XPU] Conditionally import CUDA-specific passes to avoid import errors on xpu platform (#21036)
chaojun-zhang
committed
138 days ago
Verified
d9f9a3fd
Update flashinfer CUTLASS MoE Kernel (#21408)
wenscarl
committed
138 days ago
Verified
1b25f1fe
[Bug] Fix Compressed Tensor NVFP4 `cutlass_fp4_group_mm` illegal memory access (#21465)
yewentao256
committed
138 days ago
Verified
e8cb0d04
[Docs] Rewrite Distributed Inference and Serving guide (#20593)
crypdick
committed
138 days ago
Verified
68417411