[Bugfix] Fix cuda graph sizes when running with speculative decoding #30330
Fix cuda graph bug with spec dec
a954cb18
Update vllm.py
a25c64e0
Update vllm.py
8edd14db
Merge branch 'main' into patryk/cuda-graph-spec-dec-bug
c0f06863
Assignees
No one assigned