Trtllm backend improvements #3231

leejuyuu wants to merge 13 commits into huggingface:main from leejuyuu:trtllm
leejuyuu feat(trtllm): add new finish reasons
1b9c420c
leejuyuu fix: fix prometheus_port CLI short arg conflict
592c3c79
leejuyuu fix(trtllm): fix segfault when canceling request
4e0c82fe
leejuyuu feat(trtllm): add stop sequence support
79de1c2c
leejuyuu feat(trtllm): catch broader exception
b157cd00
leejuyuu feat(trtllm): check existence of config files
8c4a14e3
leejuyuu fix(trtllm): fix do_sample being ignored
c170c662
leejuyuu feat(trtllm): get more accurate start time
ee82a085
leejuyuu perf(trtllm): reduce futile loop iterations
23b78029
leejuyuu refactor: add interior mutability to tensorrt_llm_backend_t
161f62e0
leejuyuu feat(trtllm): separate request and response loop
e2b0063c
leejuyuu fix(trtllm): handle single eos_token_id in generation_config
34307a42
leejuyuu feat(trtllm): support guided decoding
bf040884
leejuyuu force pushed from 16c764e2 to bf040884 70 days ago

Reviewers
No reviews
Assignees
No one assigned