vllm
[Frontend] Set server's maximum number of generated tokens using generation_config.json
#12242
Merged
youkaichao merged 34 commits into vllm-project:main from main
mergify added the frontend label
DarkLight1337 changed the title from "Enable setting server's maximum number of generated tokens using generation_config.json" to "[Frontend] Set server's maximum number of generated tokens using generation_config.json" 333 days ago
mhendrey requested a review from DarkLight1337 331 days ago
mhendrey requested a review from robertgshaw2-redhat 331 days ago
mhendrey requested a review from simon-mo 331 days ago
5c85448d Adding max_new_tokens support to generation_config.json
4ad6b45c Changed default_max_tokens to server_max_tokens
95f9c973 Renamed default_max_tokens to server_max_tokens
4786e563 Removed the float("inf") bug
4980a73f Renamed default_max_tokens to server_max_tokens
39d7d767 Rearranged lines to make the changes with existing as small as possible
b6a24c47 Limit generated tokens by server's max_tokens setting when available
aa7cff13 Changed syntax to pass format.sh tests
2f6e43be [Bugfix] Fix num_heads value for simple connector when tp enabled (#1…
6baa0ea5 [torch.compile] fix sym_tensor_indices (#12191)
35b59487 Move linting to `pre-commit` (#11975)
0c2f332e [DOC] Fix typo in docstring and assert message (#12194)
46249e5f [DOC] Add missing docstring in LLMEngine.add_request() (#12195)
0b2e3de3 [Bugfix] Fix incorrect types in LayerwiseProfileResults (#12196)
090eca3c [Model] Add Qwen2 PRM model support (#12202)
5d36c1fd [Core] Interface for accessing model from `VllmRunner` (#10353)
df331a75 [misc] add placeholder format.sh (#12206)
881964d0 [CI/Build] Remove dummy CI steps (#12208)
5cc6a09f [CI/Build] Make pre-commit faster (#12212)
9f3d5a68 [Model] Upgrade Aria to transformers 4.48 (#12203)
957ca23c [misc] print a message to suggest how to bypass commit hooks (#12217)
399d224c [core][bugfix] configure env var during import vllm (#12209)
df065037 [V1] Remove `_get_cache_block_size` (#12214)
b89529bf [Misc] Pass `attention` to impl backend (#12218)
a5d57f1e [Bugfix] Fix `HfExampleModels.find_hf_info` (#12223)
b1af379f [CI] Pass local python version explicitly to pre-commit mypy.sh (#12224)
0e3a719f Added tests to check max_tokens is properly set
mhendrey requested a review from mgoin 331 days ago
mhendrey requested a review from ywang96 331 days ago
mhendrey requested a review from WoosukKwon 331 days ago
mhendrey requested a review from njhill 331 days ago
mhendrey requested a review from comaniac 331 days ago
mhendrey requested a review from alexm-redhat 331 days ago
mhendrey requested a review from zhuohan123 331 days ago
mhendrey requested a review from youkaichao 331 days ago
mergify added the documentation label
mergify added the ci/build label
mergify added the needs-rebase label
mhendrey closed this 331 days ago
6867b374 Merge branch 'server_max_tokens'
99243cf6 Mucked up the rebasing. Fixing that now.
mhendrey reopened this 331 days ago
mergify removed the needs-rebase label
DarkLight1337 removed review request from mgoin 331 days ago
DarkLight1337 removed review request from comaniac 331 days ago
DarkLight1337 removed review request from njhill 331 days ago
DarkLight1337 removed review request from zhuohan123 331 days ago
DarkLight1337 removed review request from youkaichao 331 days ago
DarkLight1337 removed review request from WoosukKwon 331 days ago
DarkLight1337 removed review request from alexm-redhat 331 days ago
DarkLight1337 removed review request from ywang96 331 days ago
DarkLight1337 commented on 2025-01-23
1a15431a Reverting the serving_chat & serving_completion back and putting all …
c10eb1f3 Didn't quite revert back. Deleting empty line from both
DarkLight1337 commented on 2025-01-23
a3fc62b4 Changed to using one-liner and edited engine arg for generation-config
98949f68 Merge branch 'vllm-project:main' into main
c71f429d Converted to a one-liner for taking minimum value & added to generati…
DarkLight1337 approved these changes on 2025-01-25
DarkLight1337 enabled auto-merge (squash) 329 days ago
github-actions added the ready label
Auto-merge disabled 328 days ago (manually disabled by user)
youkaichao merged 9ddc3522 into main 328 days ago
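
The commit messages above mention reading max_new_tokens from generation_config.json, limiting generation by the server's max_tokens setting when available, and taking a minimum in a one-liner. The following is a minimal sketch of that idea, assuming the effective limit is the smallest of the remaining context window, the request's own max_tokens, and the server-side cap. The function name resolve_max_tokens and all parameter names are hypothetical and are not taken from the vLLM codebase; the merged change may be structured differently.

```python
from typing import Optional


def resolve_max_tokens(
    max_model_len: int,
    prompt_len: int,
    request_max_tokens: Optional[int] = None,
    server_max_tokens: Optional[int] = None,
) -> int:
    """Return the smallest applicable cap on generated tokens.

    Hypothetical helper for illustration only; not the vLLM implementation.
    """
    # Tokens left in the context window once the prompt is accounted for.
    remaining_context = max_model_len - prompt_len
    if remaining_context <= 0:
        raise ValueError("prompt does not fit in the context window")

    # Consider only the limits that were actually provided, then take the
    # minimum so that no single source can raise the cap set by another.
    candidates = (remaining_context, request_max_tokens, server_max_tokens)
    return min(c for c in candidates if c is not None)


if __name__ == "__main__":
    # The request asks for 512 tokens, but the server-side cap is 256.
    print(resolve_max_tokens(4096, 100, request_max_tokens=512,
                             server_max_tokens=256))
```

Under this assumption a client can still ask for fewer tokens than the server-side cap, but cannot exceed it, and neither value can exceed what the context window allows.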
Reviewers: DarkLight1337, robertgshaw2-redhat, simon-mo
Assignees: No one assigned
Labels: documentation, frontend, ready, ci/build
Milestone: No milestone