vllm-project/vllm
Commits
v1_fix_profiler
[V1] Fix profiling.py
alexm-redhat
committed
280 days ago
ccd21e19
[TPU][V1] Make `--disable_chunked_mm_input` mandatory for serving MM models (#16483)
NickLucche
committed
280 days ago
Verified
4d022cbc
Fix erroneous "model doesn't support compile" warning (#16486)
zou3519
committed
280 days ago
Verified
70de35a8
[Hardware][Intel-Gaudi] Multi-step scheduling implementation for HPU (#12779)
tzielinski-habana
committed
280 days ago
Verified
34b2cf3b
[Bugfix] Fix bugs of running Quark quantized models (#16236)
chaow-amd
committed
280 days ago
Verified
9e90c9f7
[Kernel] support merge_attn_states CUDA kernel, 3x speedup (#16173)
DefTruth
committed
280 days ago
Verified
e9528f6d
Don't install triton on `ppc64le` platform (#16470)
hmellor
committed
280 days ago
Verified
51baa9c3
[Misc] update api_client example (#16459)
reidliu41
committed
280 days ago
Verified
35e076b3
[Misc] Raise error for V1 not supporting Long LoRA. (#16415)
jeejeelee
committed
280 days ago
Verified
a26f59cc
Enforce valid max_num_batched_tokens when disable_chunked_mm_input=True (#16447)
mgoin
committed
280 days ago
Verified
aa3b3d76
[Core][LoRA][1/N] Add LoRA for EncoderDecoderModelRunner (#15990)
jeejeelee
committed
280 days ago
Verified
f7030df3
Revert "[Model] use AutoWeightsLoader for deepseek_v2, internlm2" (#16453)
DefTruth
committed
280 days ago
Verified
905e91e9
[Bugfix] Don't set an upper bound on repetition penalty (#16403)
alex-jw-brooks
committed
280 days ago
Verified
f8f9c0ba
[CPU][Bugfix] Fix CPU docker issues (#16454)
bigPYJ1151
committed
280 days ago
Verified
dda81102
[Bugfix][VLM] Fix failing Phi-4-MM multi-images tests and add vision-speech test (#16424)
Isotr0py
committed
280 days ago
Verified
93195146
Update supported_hardware.md for TPU INT8 (#16437)
mgoin
committed
280 days ago
Verified
ed375995
[Llama4] Enable attention temperature tuning by default for long context (>32k) (#16439)
sarckk
committed
280 days ago
Verified
99ef59cf
update benchmark_serving_structured_output to include auto backend (#16438)
Chenyaaang
committed
280 days ago
Verified
d544d141
check input length of sonnet samples (#16423)
alexey-belyakov
committed
280 days ago
Verified
3e397a94
Fix range_ratio Bug in RandomDataset (#16126)
jadewang21
committed
280 days ago
Verified
268c3250
[TPU][V1] Disable per-request seed/Generator (#16172)
NickLucche
committed
280 days ago
Verified
3cc9af88
[Bugfix] Fix output token length check logic (#16419)
eeslook
committed
281 days ago
Verified
7cd0bd72
[VLM] Avoid unnecessary dummy multimodal data during processing (#16416)
DarkLight1337
committed
281 days ago
Verified
56d4aefa
[V1] Zero-copy tensor/ndarray serialization/transmission (#13790)
njhill
committed
281 days ago
Verified
dd143ef5
[Model] Reduce redundant computations in mamba2 blocks for Bamba-9B (#15423)
cyang49
committed
281 days ago
Verified
daefed05
[Bugfix] Fix bug when dataset is json (#15899)
Chenyaaang
committed
281 days ago
Verified
5fbab20e
[V1][Spec Decode] Eagle Model loading (#16035)
LiuXiaoxuanPKU
committed
281 days ago
Verified
e8224f3d
[V1] Set structured output backend to `auto` by default (#15724)
russellb
committed
281 days ago
Verified
9665313c
Improve configs - `ParallelConfig` (#16332)
hmellor
committed
281 days ago
Verified
0c54fc72
[TPU][V1] Use `language_model` interface for getting text backbone in MM (#16410)
NickLucche
committed
281 days ago
Verified
c1b57855