Commits vllm-project/vllm

[Misc] Allow override VLLM_DISTRIBUTED_INIT_METHOD_OVERRIDE

Woosuk Kwon committed 167 days ago

6f47333c

[V1][Metrics][Plugin] Add plugin support for custom `StatLoggerBase` implementations (#22456)

ptovam committed 168 days ago

Verified 83e760c5

[BugFix] Disable fp8 kv-cache by default for DeepSeek V3.2 (#27121)

LucasWilkinson committed 168 days ago

Verified c2bba690

[BugFix] fix graph partition signature (#27139)

BoyuanFeng committed 168 days ago

Verified e133d6d2

[Chore] Separate out profiling utilities from vllm.utils (#27150)

dongbo910220 committed 168 days ago

Verified a1946c9f

[BugFix] Fix failing gemma-3-1b-it test: `test_lm_eval_accuracy_v1_engine[google/gemma-3-1b-it]` (#27111)

LucasWilkinson committed 168 days ago

Verified 9f020f4f

[Minor] Add some clarifying comments to recent changes (#27130)

njhill committed 168 days ago

Verified 3b450752

Fix incorrect string formatting in barrier timeout exceptions (#27149)

hyongtao-code committed 168 days ago

Verified 168e578e

[Chore] Clean up pytorch helper functions in `vllm.utils` (#26908)

Isotr0py committed 168 days ago

Verified 6ac5e06f

[Models][QwenVL] Remove unnecessary `.contiguous()` calls (#27106)

lgeiger committed 168 days ago

Verified 5c2acb27

[Misc] Refactor `get_kv_cache_spec` into `AttentionLayerBase` (#26587)

NickLucche committed 168 days ago

Verified b26b70be

[fix][cpu] fix prefill attention in CPU attention backend (#27035)

fadara01 committed 168 days ago

Verified ab4be40f

[Feature] Batch Invariant: Support DeepGEMM and Blackwell (#27127)

yewentao256 committed 168 days ago

Verified 245e4f2c

[Chore] Separate out `vllm.utils.mem_utils` (#27143)

iAmir97 committed 168 days ago

Verified 1d165d6d

[Test] Add test for /health endpoint on engine failure (#26074)

dongbo910220 committed 168 days ago

Verified 83004020

[DOC][FEATURES][CPU]update cpu feature for v1 (#27135)

xuechendi committed 168 days ago

Verified 12e21701

[Misc] Rev DeepEP (#27122)

varun-sundar-rabindranath committed 168 days ago

Verified 30a33b92

[GPT-OSS] Structure_Tag support for gpt-oss tool-call in cot (#25515)

Hanchenli committed 168 days ago

Verified 7c572544

[CI/Build] tests(v1): feed Triton attention the (num_blocks, 2, …) KV cache layout in backend-correctness tests (#26663)

hl475 committed 168 days ago

Verified c3123207

[Perf] Add H100 fused MoE config (#25398)

skyloevil committed 168 days ago

Verified c981f0ea

[BugFix][Core] Fix error when enable async-scheduling in multi-node env (#25887)

lhtin committed 169 days ago

Verified 6367bde7

[Test] Make `test_failure` more stable for batch invariance (#27054)

yewentao256 committed 169 days ago

Verified f50cc221

[V1][Spec Decode] Fix greedy temperature detection after sampler refactor (#27077)

Pradyun92 committed 169 days ago

Verified acedc74b

[Minor] Remove unnecessary error message (#27115)

zhuohan123 committed 169 days ago

Verified d29483b5

[Bugfix] Use PIECEWISE cudagraphs on Blackwell if max_model_len > 131072 (#27114)

mgoin committed 169 days ago

Verified 950cf9e5

[Chore] Remove unused `PolyNorm` layer (#27110)

Isotr0py committed 169 days ago

Verified 3125d799

[Bugfix] [AITER] [ROCm] Fix Quark MoE Quant Config and AITER Fused MoE quant type logic (#27029)

vllmellm committed 169 days ago

Verified e33ee23e

[ROCm][Bugfix][Model] Fix illegal memory access when running qwen3_moe models with rms_norm (Qwen3-235B-A22B, Qwen3-30B-A3B, etc.) (#26192)

rasmith committed 169 days ago

Verified b10c64c8

[ROCM] MoE fp4 CK kernel (#26545)

maleksan85 committed 169 days ago

Verified 0925b28a

[CI] Remove forbidden slash (#27112)

NickLucche committed 169 days ago

Verified 99722d5f