vllm
Commit 7f280d69
[Optimization] Cache sampled token ids in model runner (#20291)
Committed 307 days ago
[Optimization] Cache sampled token ids in model runner (#20291) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
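The commit title describes caching sampled token ids inside the model runner so that, during decoding, the previous step's sampled tokens can be reused directly as the next step's inputs rather than being re-derived from each request's full token history. The sketch below illustrates that general idea only; the `ModelRunner` class, its methods, and the request bookkeeping are hypothetical and are not vLLM's actual API.

```python
# Illustrative sketch of caching sampled token ids in a model runner.
# All names here are hypothetical, not vLLM's real implementation.

class ModelRunner:
    def __init__(self):
        # Cache of the token id sampled at the previous step, keyed by request id.
        self._cached_sampled_ids: dict[str, int] = {}

    def prepare_inputs(self, req_ids: list[str], prompts: dict[str, list[int]]) -> list[int]:
        # Reuse the cached sampled id when available instead of re-reading
        # each request's token history; fall back to the last prompt token.
        inputs = []
        for rid in req_ids:
            if rid in self._cached_sampled_ids:
                inputs.append(self._cached_sampled_ids[rid])
            else:
                inputs.append(prompts[rid][-1])
        return inputs

    def execute_step(self, req_ids, prompts, sample_fn):
        input_ids = self.prepare_inputs(req_ids, prompts)
        sampled = sample_fn(input_ids)  # stands in for model forward + sampling
        # Cache this step's sampled ids so the next step can reuse them.
        for rid, tok in zip(req_ids, sampled):
            self._cached_sampled_ids[rid] = tok
        return sampled
```

With a dummy `sample_fn` that returns `token + 1`, two consecutive `execute_step` calls show the cache feeding step N's output into step N+1's input without touching the prompt lists again.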
References
#20291 - [Optimization] Cache sampled token ids in model runner
Author
WoosukKwon
Parents
02cabff2