[serving] Fix continuous batching JSON response serialization #45057
Fix continuous batching JSON response serialization
b126ffdc
add example script eval-job
1a9bee08
fix script
5ee9fcff
Add test for continuous batching non-streaming JSON response
f86120cc
fix ci
92899701
Update eval script to use official transformers repo main branch
5fddeb2a
NathanHB force pushed from ec55108e to 5fddeb2a 6 days ago
add kernels and flash attn 2
db4a7746
Add continuous batching configuration CLI arguments to serve command
e5fd8cc0
Add thread lock for manager creation to avoid double manager
f9729fa6
change transformers dep
9f7e1841
NathanHB merged a91232af into main 5 days ago
NathanHB deleted the fix-continuous-batching-json-response branch 5 days ago