vllm
56a63717 - [Update] Use FlashInfer fast_decode_plan directly instead of replication (#34687)

Commit
66 days ago
[Update] Use FlashInfer fast_decode_plan directly instead of replication (#34687)

Signed-off-by: Andrii <askliar@nvidia.com>
Co-authored-by: Andrii <askliar@nvidia.com>