transformers
f3d5f255 - [CB] Easy optimizations for continuous batching (#42839)

Commit
1 day ago
[CB] Easy optimizations for continuous batching (#42839)

* Cb example more args
* Remove useless sync
* Better new tokens, and no more BS1 on outputs
* Add dynamic to compile to avoid many graphs
* Sort prefix to maximize cache hits
* More robust ways to retrieve results in test
* Style
* Update src/transformers/generation/continuous_batching/continuous_api.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>