vllm-project/vllm


ywang96 committed 1 year ago

d3eddd6e

[V1][Spec Decode] Implement Eagle Proposer [1/N] (#15729)

WoosukKwon committed 1 year ago

Verified e75a6301

[V1][Metrics] Initial speculative decoding metrics (#15151)

markmc committed 1 year ago

Verified a79cc68b

[CI] Disable flaky structure decoding test temporarily. (#15892)

ywang96 committed 1 year ago

Verified 7e3f7a4e

[Model] Add module name prefixes to gemma3 (#15889)

cloud11665 committed 1 year ago

Verified 9ec82579

[Model] Aya Vision (#15441)

JenZhao committed 1 year ago

Verified 38327cf4

[CI/Build] Clean up LoRA tests (#15867)

jeejeelee committed 1 year ago

Verified dfa82e2a

Add option to use DeepGemm contiguous grouped gemm kernel for fused MoE operations. (#13932)

bnellnm committed 1 year ago

Verified e59ca942

[ROCm][Build][Bugfix] Bring the base dockerfile in sync with the ROCm fork (#15820)

gshtras committed 1 year ago

Verified a57a3044

[Misc] Allow using OpenCV as video IO fallback (#15055)

Isotr0py committed 1 year ago

Verified 4e5a0f6a

Reinstate `format.sh` and make `pre-commit` installation simpler (#15890)

hmellor committed 1 year ago

Verified b63bd149

[Doc] Quark quantization documentation (#15861)

chaow-amd committed 1 year ago

Verified 2041c0e3

[New Model]: jinaai/jina-reranker-v2-base-multilingual (#15876)

noooop committed 1 year ago

Verified 085cbc4f

Remove `format.sh` as it's been unsupported >70 days (#15884)

hmellor committed 1 year ago

Verified 2b93162f

[Misc] remove unused script (#15746)

reidliu41 committed 1 year ago

Verified 2e45bd29

[Model] Support Mistral3 in the HF Transformers format (#15505)

mgoin committed 1 year ago

Verified 51d7c6a2

setup correct nvcc version with CUDA_HOME (#15725)

Yang Chen committed 1 year ago

Verified f3aca1ee

[Misc] Use envs.VLLM_USE_RAY_COMPILED_DAG_CHANNEL_TYPE (#15831)

ruisearch42 committed 1 year ago

Verified 8dd41d6b

[Bugfix] Fix no video/image profiling edge case for `MultiModalDataParser` (#15828)

Isotr0py committed 1 year ago

Verified 0a298ea4

[Docs] Fix small error in link text (#15868)

hmellor committed 1 year ago

Verified d330558b

[Misc] Fix speculative config repr string (#15860)

ShangmingCai committed 1 year ago

Verified 656fd729

[Misc] Enable V1 LoRA by default (#15320)

varun-sundar-rabindranath committed 1 year ago

Verified 79455cf4

[Feature] specify model in config.yaml (#15798)

wayzeng committed 1 year ago

Verified 30d6a015

fix: can not use uv run collect_env close #13888 (#15792)

yihong0618 committed 1 year ago

Verified 8af5a5c4

[V1] Implement sliding window attention in kv_cache_manager (#14097)

heheda12345 committed 1 year ago

Verified 3a5f0afc

[ROCm] Use device name in the warning (#15838)

gshtras committed 1 year ago

Verified c7e63aa4

[sleep mode] clear pytorch cache after sleep (#15248)

lionelvillard committed 1 year ago

Verified 4a9ce178

[V1] TPU - Fix fused MOE (#15834)

alexm-redhat committed 1 year ago

Verified 7e4e709b

[Bugfix]: Fix is_embedding_layer condition in VocabParallelEmbedding (#15824)

alexwl committed 1 year ago

Verified 63d8eabe

[Bugfix] Fix extra comma (#15851)

haochengxia committed 1 year ago

Verified e830b013
