vllm
682566b1
- [Bug] Refactor max_num_batched_tokens to account for drafting (#34898)
Commit
2 days ago
[Bug] Refactor max_num_batched_tokens to account for drafting (#34898)

Signed-off-by: Benjamin Chislett <bchislett@nvidia.com>
References
#34898 - [Bug] Refactor max_num_batched_tokens to account for drafting
Author
benchislett
Parents
b9c2a565