vllm-project/vllm
Commits
reduce_scatter_comm
Reduce Scatter Plumbing
tlrmchlsmth
committed
316 days ago
3679753a
[Bugfix] Initialize attention bias on the same device as Query/Key/Value for QwenVL Series (#14031)
LouieYang
committed
316 days ago
Verified
9b61dd41
[VLM][Bugfix] Enable specifying prompt target via index (#14038)
DarkLight1337
committed
316 days ago
Verified
f7bee5c8
[Bugfix] Fix MoeWNA16Method activation (#14024)
jeejeelee
committed
316 days ago
Verified
e0734387
Update AutoAWQ docs (#14042)
hmellor
committed
316 days ago
Verified
f58f8b5c
[V1][Minor] Restore V1 compatibility with LLMEngine class (#13090)
Ryp
committed
316 days ago
Verified
b3f7aacc
[Hardware][Intel-Gaudi] Regional compilation support (#13213)
Kacper-Pietkun
committed
316 days ago
Verified
b91660dd
Use smaller embedding model when not testing model specifically (#13891)
hmellor
committed
316 days ago
Verified
76c89fca
[Bugfix][Disaggregated] patch the inflight batching on the decode node in SimpleConnector to avoid hangs in SimpleBuffer (nccl based) (#13987)
hasB4K
committed
316 days ago
Verified
b9e41734
[Doc] Move multimodal Embedding API example to Online Serving page (#14017)
DarkLight1337
committed
316 days ago
Verified
1088f062
[Bugfix] Check that number of images matches number of <|image|> tokens with mllama (#13911)
tjohnson31415
committed
317 days ago
Verified
73e0225e
[V1]`SupportsV0Only` protocol for model definitions (#13959)
ywang96
committed
317 days ago
Verified
6c85da3a
[Misc] Print FusedMoE detail info (#13974)
jeejeelee
committed
317 days ago
Verified
67fc4268
[Model][Speculative Decoding] Expand DeepSeek MTP code to support k > n_predict (#13626)
benchislett
committed
317 days ago
Verified
9804145c
[Attention] Flash MLA for V1 (#13867)
LucasWilkinson
committed
317 days ago
Verified
2e94b9cf
[core] Perf improvement for DSv3 on AMD GPUs (#13718)
qli88
committed
317 days ago
Verified
8294773e
[V1][Minor] Minor cleanup for GPU Model Runner (#13983)
WoosukKwon
committed
317 days ago
Verified
cd813c6d
[ROCm] Fix the Kernels, Core, and Prefix Caching AMD CI groups (#13970)
SageMoore
committed
317 days ago
Verified
38acae6e
[VLM] Deprecate legacy input mapper for OOT multimodal models (#13979)
DarkLight1337
committed
317 days ago
Verified
a2dd48c3
Bump azure/setup-helm from 4.2.0 to 4.3.0 (#13742)
dependabot[bot]
committed
317 days ago
Verified
126f6bee
[Attention] MLA support for V1 (#13789)
Yang Chen
committed
317 days ago
Verified
58d1b2aa
[VLM] Generalized prompt updates for multi-modal processor (#13964)
DarkLight1337
committed
317 days ago
Verified
f1579b22
[Bugfix] Fix qwen2.5-vl overflow issue (#13968)
Isotr0py
committed
317 days ago
Verified
78648758
Update LMFE version to v0.10.11 to support new versions of transforme… (#13930)
noamgat
committed
317 days ago
Verified
1dd422b6
[bugfix] Fix profiling for RayDistributedExecutor (#13945)
ruisearch42
committed
317 days ago
Verified
06c8f8d8
Deduplicate `.pre-commit-config.yaml`'s `exclude` (#13967)
hmellor
committed
317 days ago
Verified
5677c9bb
Update quickstart.md (#13958)
observerw
committed
317 days ago
Verified
512d77d5
[Model] Deepseek GGUF support (#13167)
SzymonOzog
committed
317 days ago
Verified
7f0be2aa
[VLM] Support multimodal inputs for Florence-2 models (#13320)
Isotr0py
committed
317 days ago
Verified
edf309eb
Fix test_block_fp8.py test for MoE (#13915)
mgoin
committed
317 days ago
Verified
788f284b