vllm-project/vllm
Commits
deprecate-quantization-schemes
Update vllm/config/model.py
robertgshaw2-redhat
committed
1 day ago
Verified
6de6e95e
fix failing quantizaton test
Robert Shaw
committed
2 days ago
b39074e4
updated
Robert Shaw
committed
2 days ago
2f1daa86
updated
Robert Shaw
committed
2 days ago
7bb15cda
updated list of schemes
Robert Shaw
committed
2 days ago
0afd11d9
updated
Robert Shaw
committed
2 days ago
522aff03
initial commit
Robert Shaw
committed
2 days ago
430670c5
[BugFix] Async scheduling: handle model forward errors more cleanly (#31611)
njhill
committed
2 days ago
Verified
b53b89fd
[misc] Sort uvicorn log level description according to verbosity (#31137)
andyxning
committed
2 days ago
Verified
6522721d
fix no think of GLM-4.5 / GLM-4.7 (#31449)
zRzRzRzRzRzRzR
committed
3 days ago
Verified
0d4044ed
[Docs] Fix argparse include path for mm-processor benchmark (#31654)
reaganjlee
committed
3 days ago
Verified
41ab1797
[MoE Refactor][13/N] Convert FI to Use PFNoEP (#31533)
robertgshaw2-redhat
committed
3 days ago
Verified
268b1c55
[CI][Bugfix] Fix token counting in chunked prefill compl test (#31630)
AndreasKaratzas
committed
4 days ago
Verified
4f9ce35a
Improve HF qwen3_omni: preserve audio_sample_rate in kwargs restructuring (#29255)
jeremyteboul
committed
4 days ago
Verified
97a01308
[Core] Parse vLLM engine required fields from hf_config to model_arch_config (#28454)
charlotte12l
committed
4 days ago
Verified
0eee877f
[Benchmark] Fix OOM during MoE kernel tuning for large models (#31604)
massif-01
committed
4 days ago
Verified
a0e9ee83
[MoE Refactor] Explicit construct mk for flashinfer bf16 kernel (#31504)
zyongye
committed
4 days ago
Verified
a3f2f409
[MoE Refactor] Split `invoke_fused_moe_kernel` (#31050)
zyongye
committed
4 days ago
Verified
5a468ff7
[MoE] Fix output_shape calculation in Attention layer to handle 3D query inputs (#31596)
AndreasKaratzas
committed
4 days ago
Verified
6ef770df
[BugFix] Support online dense model DP without overhead (#30739)
njhill
committed
4 days ago
Verified
bd877162
CustomOp: test forward dispatch for grouped_topk (#31530)
xinyu-intel
committed
4 days ago
Verified
08f425ba
Add multimodal input method in the documentation (#31601)
labAxiaoming
committed
4 days ago
Verified
a01f2fae
[Bugfix] Fix weight_loader v1 block scale (#31103)
kyuyeunk
committed
5 days ago
Verified
cc410e86
[Bugfix][Hardware][AMD] Fix last_page_len calculation in AITER MLA decode (#31282)
c0de128
committed
5 days ago
Verified
825c2dc1
Remove unused `use_marlin` variable in `Mxfp4MoEMethod` (#31549)
vsourirajan
committed
5 days ago
Verified
1f43c121
[Bugfix] Fix activation quantization for compressed-tensors W4A16 (#31572)
Tmn07
committed
5 days ago
Verified
ca179d0f
[ROCm][CI] Fix ModernBERT token classification test (#31612)
AndreasKaratzas
committed
5 days ago
Verified
013b5408
[Model] Enable LoRA support for tower and connector in LLaVA (#31513)
jayhemnani9910
committed
5 days ago
Verified
5ac55eb3
[Bugfix] Fix block size used in EAGLE slot mapping (#31540)
benchislett
committed
5 days ago
Verified
ea53ca5e
feat: support LoRA for DeepSeek-OCR(Language Model part) (#31569)
zhima771
committed
5 days ago
Verified
27864a85