vllm
Add `--max-model-len auto` to auto-fit context to available memory
#29431
Merged

mgoin
mgoin Auto-fit max_model_len
1575490e
mgoin requested a review from WoosukKwon 167 days ago
mgoin requested a review from robertgshaw2-redhat 167 days ago
mgoin requested a review from njhill 167 days ago
mgoin requested a review from ywang96 167 days ago
mgoin requested a review from alexm-redhat 167 days ago
mgoin requested a review from heheda12345 167 days ago
mgoin requested a review from ApostaC 167 days ago
mgoin requested a review from youkaichao 167 days ago
mgoin requested a review from tlrmchlsmth 167 days ago
mgoin requested a review from houseroad 167 days ago
mgoin requested a review from hmellor 167 days ago
mgoin requested a review from yewentao256 167 days ago
mgoin requested a review from ProExpertProg 167 days ago
mergify added v1
gemini-code-assist commented on 2025-11-25
mgoin collective_rpc("update_max_model_len")
c02f06e5
mgoin changed the title from "Auto-fit max_model_len" to "Add `--max-model-len -1` to auto-fit context length to GPU memory" 167 days ago
mgoin changed the title from "Add `--max-model-len -1` to auto-fit context length to GPU memory" to "Add `--max-model-len -1` to auto-fit context to available memory" 167 days ago
mgoin
mgoin added feature request
mgoin added startup-ux
mgoin Remove SchedulerConfig.max_model_len
956920bb
chatgpt-codex-connector
mgoin requested a review from noooop 167 days ago
njhill
mgoin Fix TP when hybrid manager is disabled
7fe724f7
NickLucche commented on 2025-11-25
mgoin Support "auto" as well
11fe63af
mgoin changed the title from "Add `--max-model-len -1` to auto-fit context to available memory" to "Add `--max-model-len auto` to auto-fit context to available memory" 167 days ago
mgoin added ready
heheda12345 commented on 2025-11-25
heheda12345
mgoin Restructure so specs merge first
cd32a22a
mgoin requested a review from heheda12345 167 days ago
mgoin requested a review from NickLucche 167 days ago
NickLucche approved these changes on 2025-12-05
mgoin Merge branch 'main' into auto-fit-max-model-len
9fc95795
mgoin Fix estimate_max_model_len side effect
b7bff7f9
mgoin Remove dupe assign
90dd4797
heheda12345 commented on 2025-12-10
mgoin Account for hybrid model padding
3238f220
mgoin Pull out search into function
ed4d76f5
mgoin Merge branch 'main' into auto-fit-max-model-len
e2fb0aba
mgoin
mgoin added ready-run-all-tests
vllm-bot merged 8ee90c83 into main 139 days ago
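
The PR title describes fitting the context length to available memory, and one commit above is titled "Pull out search into function". The timeline does not show the implementation, so the following is only a minimal sketch of how such a search could work, not vLLM's actual code. The function name `largest_fitting_length`, the `kv_cache_fits` predicate, and all byte figures are hypothetical and chosen purely for illustration.

```python
from typing import Callable


def largest_fitting_length(upper: int, fits: Callable[[int], bool]) -> int:
    """Return the largest n in [1, upper] for which fits(n) is True, or 0.

    Assumes fits is monotone: if a length fits, every shorter length fits too.
    """
    lo, hi, best = 1, upper, 0
    while lo <= hi:
        mid = (lo + hi) // 2
        if fits(mid):
            best = mid      # mid fits; try something longer
            lo = mid + 1
        else:
            hi = mid - 1    # mid does not fit; try something shorter
    return best


# Hypothetical figures, not taken from the PR.
BYTES_PER_TOKEN = 128 * 1024      # assumed KV-cache cost per token
AVAILABLE_BYTES = 8 * 1024**3     # assumed memory left after loading weights


def kv_cache_fits(num_tokens: int) -> bool:
    """Assumed memory check: a linear per-token cost against a fixed budget."""
    return num_tokens * BYTES_PER_TOKEN <= AVAILABLE_BYTES


# Search up to an assumed model-native limit of 131072 tokens.
auto_len = largest_fitting_length(131072, kv_cache_fits)
print(auto_len)
```

Binary search is used here only because the fit check is assumed to be monotone in the context length; a real memory estimate may need padding or per-layer adjustments, as the "Account for hybrid model padding" commit suggests.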
