vllm
682566b1 - [Bug] Refactor max_num_batched_tokens to account for drafting (#34898)

Commit
2 days ago
[Bug] Refactor max_num_batched_tokens to account for drafting (#34898)

Signed-off-by: Benjamin Chislett <bchislett@nvidia.com>