[Perf] [Hybrid] Copy num_accepted_tokens in non-blocking way when not using prefix caching #35442
Copy num_accepted_tokens in non-blocking way when not using prefix ca…
5c244276
Use existing self.num_accepted_tokens buffer instead of temporary tensor
a5eae630
tdoublep force pushed from fcb96b91 to a5eae630 5 days ago
vllm-bot merged ad9d09e2 into main 4 days ago
tdoublep deleted the faster-mtp-no-pc branch 4 days ago
Assignees
No one assigned