vllm-project/vllm

Commits

Merge remote-tracking branch 'nm/lwilkinson/fix-flashmla-full-cudagraph' into wide_ep_working_branch

tlrmchlsmth committed 306 days ago

f1c9ef3a

fix dp plus full cuda-graph

LucasWilkinson committed 306 days ago

d80a82f9

[Misc] Refactor vllm config str (#21666)

andyxning committed 307 days ago

Verified a9b2a1d7

Fix CUDA permute/unpermute for use with DeepGemm Moe (#17934)

CalebDu committed 307 days ago

Verified 57c22e57

[Refactor] Refactor MOE NVFP4 Code Base: ModelOpt + Compressed Tensor (#21631)

yewentao256 committed 307 days ago

Verified bda9d053

[VLM] Add video support for Intern-S1 (#21671)

Isotr0py committed 307 days ago

Verified 3d847a31

Migrate Florence2ImagePixelInputs to TensorSchema (#21663)

Benji Beck committed 307 days ago

Verified 5f8c9a42

[Misc] add default value for file pattern arg (#21659)

andyxning committed 307 days ago

Verified 1cbf951b

Refactor: Remove numpy dependency from LoggingStatLogger (#20529)

skyloevil committed 307 days ago

Verified a8936e51

[CI/Build][Doc] Clean up more docs that point to old bench scripts (#21667)

yeqcharlotte committed 307 days ago

Verified 01a395e9

Handle non-serializable objects in vllm bench (#21665)

huydhn committed 307 days ago

Verified 971948b8

[VLM] Support HF format Phi-4-MM model (#17121)

Isotr0py committed 307 days ago

Verified eed2f463

Migrate ChameleonImagePixelInputs to TensorSchema (#21657)

Benji Beck committed 307 days ago

Verified 20950b29

Migrate FuyuImagePatchInputs to TensorSchema (#21662)

Benji Beck committed 307 days ago

Verified 3339cba3

Migrate DeepseekVL2ImageInputs to TensorSchema (#21658)

Benji Beck committed 307 days ago

Verified 0b8caf90

Migrate Blip2ImagePixelInputs and Blip2ImageEmbeddingInputs to TensorSchema (#21656)

Benji Beck committed 307 days ago

Verified ccf27cc4

support `torch.compile` for bailing moe (#21664)

jinzhen-lin committed 307 days ago

Verified c6573698

Remove xformers requirement for Mistral-format Pixtral and Mistral3 (#21154)

wenchen76 committed 307 days ago

Verified 6c66f28f

[NVIDIA] Explicitly disable shuffled weights for flashinfer blockscale moe fp8 kernels (#21411)

kaixih committed 308 days ago

Verified de509ae8

[CI/Build][Doc] Move existing benchmark scripts in CI/document/example to vllm bench CLI (#21355)

yeqcharlotte committed 308 days ago

Verified e7c4f9ee

[Bugfix][Apple Silicon] fix missing symbols when build from source on Mac with Apple Silicon (#21380)

zhouyeju committed 308 days ago

Verified 9094d11c

[Refactor] Remove `moe_align_block_size_triton` (#21335)

yewentao256 committed 308 days ago

Verified 56e544f2

[BugFix] Fix shared storage connector load kv only load attention layer (#21428)

david6666666 committed 308 days ago

Verified 97d6c30c

[Misc] Improve memory profiling debug message (#21429)

yeqcharlotte committed 308 days ago

Verified a40a8506

[Bug] Fix `has_flashinfer_moe` Import Error when it is not installed (#21634)

yewentao256 committed 308 days ago

Verified c215f5c8

Support encoder-only models without KV-Cache (#21270)

maxdebayser committed 308 days ago

Verified 1cd6eaba

[Bugfix] Investigate Qwen2-VL failing test (#21527)

Isotr0py committed 308 days ago

Verified f27fdfc3

Migrate AyaVisionImagePixelInputs to TensorSchema for shape validation (#21622)

Benji Beck committed 308 days ago

Verified de10ff0b

Migrate AriaImagePixelInputs to TensorSchema for shape validation (#21620)

Benji Beck committed 308 days ago

Verified 9d197280

[Take 2] Correctly kill vLLM processes after benchmarks (#21646)

huydhn committed 308 days ago

Verified e98def43
