vllm
7f280d69 - [Optimization] Cache sampled token ids in model runner (#20291)

Commit
307 days ago
[Optimization] Cache sampled token ids in model runner (#20291)

Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>