vllm
[Perf] [Hybrid] Copy num_accepted_tokens in non-blocking way when not using prefix caching
#35442

Merged

vllm-bot merged 2 commits into vllm-project:main from tdoublep:faster-mtp-no-pc

mergify added v1

gemini-code-assist commented on 2026-02-26

heheda12345 commented on 2026-02-27

heheda12345 approved these changes on 2026-03-03

heheda12345 enabled auto-merge (squash) 5 days ago

github-actions added ready

Copy num_accepted_tokens in non-blocking way when not using prefix ca…

5c244276

Use existing self.num_accepted_tokens buffer instead of temporary tensor

a5eae630

tdoublep force pushed from fcb96b91 to a5eae630 5 days ago

tdoublep requested a review from njhill 5 days ago

vllm-bot merged ad9d09e2 into main 4 days ago

tdoublep deleted the faster-mtp-no-pc branch 4 days ago

Reviewers

heheda12345

gemini-code-assist

njhill

Labels

ready v1
