Trtllm backend improvements #3231
feat(trtllm): add new finish reasons
1b9c420c
fix: fix prometheus_port CLI short arg conflict
592c3c79
fix(trtllm): fix segfault when canceling request
4e0c82fe
feat(trtllm): add stop sequence support
79de1c2c
feat(trtllm): catch broader exception
b157cd00
feat(trtllm): check existence of config files
8c4a14e3
fix(trtllm): fix do_sample being ignored
c170c662
feat(trtllm): get more accurate start time
ee82a085
perf(trtllm): reduce futile loop iterations
23b78029
refactor: add interior mutability to tensorrt_llm_backend_t
161f62e0
feat(trtllm): separate request and response loop
e2b0063c
fix(trtllm): handle single eos_token_id in generation_config
34307a42
feat(trtllm): support guided decoding
bf040884
leejuyuu force pushed from 16c764e2 to bf040884 70 days ago