vllm-project/vllm
Commits
woosuk/flashinfer-swa
opt
WoosukKwon
committed
141 days ago
2ad6985c
fix
WoosukKwon
committed
141 days ago
da03cb8f
[Optimization] Truncate kv page indices for sliding window attention
WoosukKwon
committed
141 days ago
90d43db4
[Log] Debug Once for Randomizing dummy data for DP Rank (#22860)
yewentao256
committed
141 days ago
Verified
df5afa82
[Model] Granite-4 support loading quantized checkpoint (#22925)
cyang49
committed
141 days ago
Verified
6cd69f51
[Kernels] Clean up FusedMoeMethodBase and modular kernel setup. Remove extra arguments from modular kernel methods. (#22035)
bnellnm
committed
141 days ago
Verified
8ad7285e
[Structured Output] Make the output of structured output example more complete (#22481)
shen-shanshan
committed
141 days ago
Verified
48b01fd4
[Benchmarks] Include image data when ShareGPT4V dataset is used. (#22955)
huachenheli
committed
141 days ago
Verified
993d3d12
[FIXBUG] Correctly Apply Grammar Bitmask in Mixed Batches (#22896)
JartX
committed
141 days ago
Verified
68af77e5
[BugFix] Skip the Q component for QKVParallelLinear in the case of QKVCrossParallelLinear since its width is 0 (#22369)
sstamenk
committed
141 days ago
Verified
6b04039a
[V0 Deprecation] Remove advance_step (#22969)
WoosukKwon
committed
141 days ago
Verified
1c859a13
[Core] Allow full cudagraph with separate attention routines and orthogonal to compilation, add support for FA2 and FlashInfer (#20059)
fhl2000
committed
141 days ago
Verified
74f441f4
[Frontend] Expose do_log_stats interval to env (#22905)
Csrayz
committed
141 days ago
Verified
a0632a3e
[CI] Remove duplicated docs build from buildkite (#22924)
hmellor
committed
141 days ago
Verified
e8b40c7f
[Misc] Ignore ep_kernels_workspace (#22807)
jeejeelee
committed
141 days ago
Verified
48f46369
[V1] [Hybrid] Support using float32 for state in Hybrid Models (Mamba2, Mamba1, Minimax) (#22928)
tdoublep
committed
141 days ago
Verified
75531a6c
Improve multimodal hasher performance for re-used Image prompts (#22825)
p88h
committed
141 days ago
Verified
22341b99
[MM] Allow skipping memory profiling for multimodal models. (#22950)
Roger Wang
committed
141 days ago
Verified
49252cf5
[Bugfix] fix cuda 12.6 and 11.8 build (#22952)
jinzhen-lin
committed
141 days ago
Verified
3e6dd400
[Bugfix] Unquote file uri before reading image (#22912)
sayandipdutta
committed
142 days ago
Verified
aa300c43
[V1] - Split Prefill and Decode for Mamba1 models (#22653)
amirai21
committed
142 days ago
Verified
fe91ce95
[CI] Pooling models mteb test uses enforce_eager (#22878)
noooop
committed
142 days ago
Verified
5406ebf5
[P/D]Provide bucket algorithm rate limiter for proxy_server (#22643)
frankie-ys
committed
142 days ago
Verified
b2c06509
Revert "[ROCm][AITER] Support AITER Rope ops in RotaryEmbedding Module." (#22956)
tjtanaa
committed
142 days ago
Verified
b2f6c247
[Mamba] - refactor: Renamed mamba_attn to mamba2_attn (#22818)
Josephasafg
committed
142 days ago
Verified
3d232dbd
[Feature] Full Cuda Graph Support for Cutlass MLA and 6% E2E Throughput Improvement (#22763)
yewentao256
committed
142 days ago
Verified
5c3fbfe4
refactor: Change scaling factors calculation for flashinfer FusedMoE (#22812)
amirkl94
committed
142 days ago
Verified
b4cef5e6
[CI Perf] Prune tests in `tests/kernels/attention/` (#22936)
mgoin
committed
142 days ago
Verified
0fe85087
[CI Perf] Prune tests in `tests/kernels/moe/` (#22939)
mgoin
committed
142 days ago
Verified
d2b0e97e
[CI Perf] Prune tests in `tests/kernels/quantization/` (#22942)
mgoin
committed
142 days ago
Verified
590bddbf