Pull Requests vllm-project/vllm

[ROCm][CI] Set VLLM_FLOAT32_MATMUL_PRECISION="tf32" For terratorch Tests In AMD CI rocm ready ci/build

#31242 by micah-wil was merged 2025-12-24 03:21

Revert "[bench] Support common prefix len config (for decode-only bench)" performance

#31240 by minosfuture was merged 2025-12-24 05:17

[ROCm][CI][Bugfix] Fix Siglip2 rotary embedding dispatch and InternVL video test tolerance rocm ready multi-modality

#31235 by AndreasKaratzas was merged 2025-12-24 02:56

docs: Add llm-d integration to the website documentation ready

#31234 by terrytangyuan was merged 2025-12-23 20:27

[Bugfix] Enable `dynamic_dims` for different embeds shape ready multi-modality qwen

#31223 by DarkLight1337 was merged 2025-12-24 02:15

[Chore] Simplify logic of `_execute_mm_encoder` ready v1 multi-modality

#31222 by DarkLight1337 was merged 2025-12-24 02:15

Only patch `original_max_position_embeddings` for Transformers v4 ready

#31214 by hmellor was merged 2025-12-23 16:46

Correct position of docstring of class attributes v1

#31209 by wdhongtw was merged 2025-12-23 10:08

[Misc] Introduce `encode_*_url` utility function tpu ready v1 multi-modality kv-connector

#31208 by DarkLight1337 was merged 2025-12-23 13:45

[ROCm][Bugfix] Fix RuntimeError in MMEncoderAttention by replacing .view() with .reshape() rocm ready multi-modality

#31203 by AndreasKaratzas was merged 2025-12-23 21:48

[Bugfix] Fix Jais2ForCausalLM ready

#31198 by jeejeelee was merged 2025-12-23 07:44

Revert "[SM100] Enable fp8 compute for prefill MLA (#30746)" v1 nvidia

#31197 by pavanimajety was merged 2025-12-23 02:15

[ci] Fix Pytorch compilation test oom in 2.10 ready

#31194 by angelayi was merged 2025-12-23 01:56

[AMD][CI] fix v1/engine test_preprocess_error_handling rocm ready v1

#31192 by divakar-amd was merged 2025-12-23 01:28

[Transformers][Bugfix] Migrated to new transformers nightly logic

#31190 by AndreasKaratzas was closed 2025-12-23 00:04

[CI Failure] Disable mosaicml/mpt-7b and databricks/dbrx-instruct tests documentation ready ci-failure

#31182 by mgoin was merged 2025-12-22 23:40

[Doc] Add vllm-metal to hardware plugin documentation documentation ready cpu

#31174 by mgoin was merged 2025-12-22 20:06

[Bug] Fix `'CutlassMLAImpl' object has no attribute '_workspace_buffer'` ready v1 nvidia

#31173 by yewentao256 was merged 2025-12-22 22:24

[Perf] Remove blocking copy in GDN Attention performance ready v1

#31167 by benchislett was merged 2025-12-22 22:25

Fix prefill trace warmup documentation frontend ci/build v1

#31163 by sraizada-tt was closed 2025-12-22 16:36

[Bugfix] Fix MoE LoRA bin/pt loading ready

#31161 by jeejeelee was merged 2025-12-23 11:09

[Bug] Fix `Number of dimensions of tensors must match.` for Deepseek V3.2 ready deepseek

#31160 by yewentao256 was merged 2025-12-24 02:41

[ROCm][CI/Build] Fix triton version to one that has triton_kernels required for gpt-oss to run rocm ready ci/build gpt-oss

#31159 by gshtras was merged 2025-12-22 17:19

[Bugfix][ROCm] Fix typo: triton_fp4_gemm_dynamic_qaunt -> quant rocm

#31157 by c0de128 was closed 2025-12-22 21:11

[ROCm] [Critical]: Remove unused variable rocm ready

#31156 by tjtanaa was merged 2025-12-22 16:28

[Chore] Update more locations to use `attention_config.backend` performance ready

#31153 by DarkLight1337 was merged 2025-12-23 03:19

[CI][Bugfix] Fix `entrypoints/openai/test_audio.py` ready

#31151 by NickLucche was merged 2025-12-22 15:21

[Bugfix][ROCm][Dynamo][DS 3.1][FP8] fix unsupported hasattr call when Dynamo tracing for ROCm device rocm ready

#31149 by zejunchen-zejun was merged 2025-12-24 05:32

Add util function for checking nesting of rope parameters ready

#31146 by hmellor was merged 2025-12-23 11:41

Add encode time documentation performance new-model rocm frontend tpu speculative-decoding needs-rebase ci/build v1 multi-modality llama qwen deepseek cpu gpt-oss kv-connector nvidia

#31143 by LJH-LBJ was closed 2025-12-22 12:08