vllm-project/vllm
Commits
overlap-context-manager
updated
Robert Shaw
committed
5 days ago
71a4a5b8
[CI][AMD][BugFix] Update wvSplitK (and other skinny_gemm wrappers) to ensure tensors passed will be made contiguous for the kernel (#32831)
rasmith
committed
5 days ago
Verified
6cc6d92b
[Bug] Fix benchmark script `moe_permute_unpermute` (#32949)
yewentao256
committed
5 days ago
Verified
dfab5f37
fix: Add glm4_moe_lite to MLA detection (#32614)
marksverdhei
committed
5 days ago
Verified
586a57ad
[cudagraphs] Refactor cudagraph capture loop (#32946)
LucasWilkinson
committed
5 days ago
Verified
3a414595
[Model Runner V2] Add KV Connector support (#32742)
njhill
committed
5 days ago
Verified
8518b304
[Bugfix][CI] Fix pre-commit (#32956)
MatthewBonanni
committed
5 days ago
Verified
2d6b5371
[CI][torch nightlies] Use main Dockerfile with flags for nightly torch tests (#30443)
orionr
committed
5 days ago
Verified
68b0a6c1
[V1][Hybrid] Mamba Prefix Caching with align mode (#30877)
peakcrosser7
committed
5 days ago
Verified
5206e5e2
[Model] Enable LoRA support for internvl2 (#32397)
MatteoFari
committed
5 days ago
Verified
fec9da0a
[torch.compile][CI] Add back attn fusion on hopper/ada (#32940)
ProExpertProg
committed
5 days ago
Verified
bbbd696a
[Frontend] add logprob, compression_rate to 'verbose_json' features (#31059)
sangbumlikeagod
committed
5 days ago
Verified
9b77bb79
[Hardware][AMD][CI][Bugfix] Fix Kernels Attention Cache test (#32904)
mawong-amd
committed
5 days ago
Verified
305e53ad
[ROCm][PD] Remove unused moriio connector proxy code (#32939)
markmc
committed
5 days ago
Verified
1cb4341f
[Bugfix] Fix FP8 MoE EP Weight Loading for ModelOpt Llama4 (#32886)
baonudesifeizhai
committed
5 days ago
Verified
1fb648bf
[Misc] Postpone torch_profiler deprecation (#32867)
NickLucche
committed
5 days ago
Verified
7e223097
[Bugfix] Disable tma_aligned_scales in test_fusions_e2e (#32916)
xyang16
committed
5 days ago
Verified
90c20079
[Bugfix] Fix getting vision features in Transformer Multimodal backend (#32933)
zucchini-nlp
committed
5 days ago
Verified
d95d6507
[Feature]: Remove DtoH Copy for lfm2_vl On Default Stream (#32815)
tianshu-Michael-yu
committed
5 days ago
Verified
13d8746c
[CPU][Feat] Update PyTorch to v2.10 for CPU Backend (#32869)
fadara01
committed
5 days ago
Verified
10e94c84
[Benchmark][Bugfix] Fix race condition when starting server for sweep benchmark (#32927)
Isotr0py
committed
5 days ago
Verified
243e78c2
[CPU Backend][BugFix] Fix failing CPU MoE test (#32876)
fadara01
committed
5 days ago
Verified
aac0b817
[Frontend][3/n] Make pooling entrypoints request schema consensus | EmbedRequest & ClassifyRequest (#32905)
noooop
committed
5 days ago
Verified
05f3d714
[Voxtral] Add new streaming arch (#32861)
patrickvonplaten
committed
5 days ago
Verified
3f3f8952
[CI/Build][CPU] Fix failed pooling tests and macos smoke test (#32907)
bigPYJ1151
committed
5 days ago
Verified
5da4c7d7
[Misc] Add `get_name` to missing AttentionBackends (#32698)
NickLucche
committed
5 days ago
Verified
160c6fa3
[CI][Models] Add VLM Support for Sequence Classification Conversion (#32885)
AndreasKaratzas
committed
5 days ago
Verified
a8eb1182
[Bugfix] Fix _CPU_MOE_ACT AssertionError when vLLM config not set (#32777)
karanb192
committed
5 days ago
Verified
fa6e599a
[CI] Fix mypy for `vllm/v1/structured_output` (#32722)
yewentao256
committed
5 days ago
Verified
7ef58737
[torch.compile] Compile `CustomOp.forward_native` for `SiluAndMul` and `QuantFP8` to avoid raw torch ops inside opaque custom ops (#32806)
ProExpertProg
committed
5 days ago
Verified
5e4e0e51