vllm
[BugFix] Add support for MTP num_speculative_tokens > 1 with sparse MLA
#34552
Merged

LucasWilkinson
LucasWilkinson fix
64f67545
LucasWilkinson requested a review from benchislett 20 days ago
LucasWilkinson requested a review from luccafong 20 days ago
mergify added speculative-decoding
mergify added v1
mergify added bug
gemini-code-assist
gemini-code-assist commented on 2026-02-14
LucasWilkinson clean
d7b5665c
mergify
LucasWilkinson clean
1ca80d1d
mergify
mgoin
jeejeelee
LucasWilkinson
MatthewBonanni
MatthewBonanni Fixes
4e996fac
mergify
MatthewBonanni Fix pre-commit
58ac3cee
mergify
MatthewBonanni Fix pre-commit
a30366fd
MatthewBonanni requested a review from pavanimajety 15 days ago
mergify
mergify added needs-rebase
MatthewBonanni Implement flattening
92e86aeb
MatthewBonanni Merge branch 'main' into lwilkinson/fix-glm-5-mtp-more-then-1
1a11906a
MatthewBonanni Restore original comment
de26a81d
mergify removed needs-rebase
MatthewBonanni Fix bad merge
bd4f3e76
MatthewBonanni Handle cudagraphs
488de835
mergify
mergify added needs-rebase
MatthewBonanni requested a review from MatthewBonanni 9 days ago
MatthewBonanni Merge branch 'main' into lwilkinson/fix-glm-5-mtp-more-then-1
bc79aa1d
mergify removed needs-rebase
LucasWilkinson
LucasWilkinson commented on 2026-02-25
MatthewBonanni Preallocate _expanded_block_table_buffer
617377e4
MatthewBonanni Merge branch 'main' into lwilkinson/fix-glm-5-mtp-more-then-1
b4e451aa
MatthewBonanni Typo
6d2da6d9
LucasWilkinson
LucasWilkinson commented on 2026-02-25
LucasWilkinson
LucasWilkinson commented on 2026-02-25
LucasWilkinson
LucasWilkinson commented on 2026-02-25
LucasWilkinson
LucasWilkinson commented on 2026-02-25
LucasWilkinson
LucasWilkinson commented on 2026-02-25
MatthewBonanni Add explanatory comments
bb809b1b
MatthewBonanni Use query_start_loc
4ee5eb04
MatthewBonanni Preallocate arange
a30b20b8
MatthewBonanni consistent naming
793eb6de
MatthewBonanni Only zero first column
02d952c0
MatthewBonanni Always flatten
ce4d4d21
mgoin added ready
mergify
mergify added needs-rebase
MatthewBonanni Merge branch 'main' into lwilkinson/fix-glm-5-mtp-more-then-1
10de9177
MatthewBonanni requested a review from njhill 7 days ago
mergify removed needs-rebase
mergify
mergify added needs-rebase
LucasWilkinson force pushed from 10de9177 to 9fc312d7 5 days ago
mergify removed needs-rebase
mergify
mergify
MatthewBonanni force pushed from 15031427 to 10de9177 4 days ago
MatthewBonanni Merge branch 'main' into lwilkinson/fix-glm-5-mtp-more-then-1
d95ee8cb
MatthewBonanni Fix test
05389df2
MatthewBonanni changed the title from "[BugFix] Fix GLM-5 MTP not supporting num_speculative_tokens > 1" to "[BugFix] Add support for MTP num_speculative_tokens > 1 with sparse MLA" 4 days ago
mergify
mergify added needs-rebase
MatthewBonanni Fix tests
5b67f7b9
MatthewBonanni Fix tree attention
8418f835
LucasWilkinson Merge remote-tracking branch 'origin/main' into lwilkinson/fix-glm-5-…
7f24f67a
mergify removed needs-rebase
MatthewBonanni
MatthewBonanni approved these changes on 2026-03-03
vllm-bot merged 28ef9ba3 into main 3 days ago
khluu added this to the v0.17.0 cherry picks milestone 3 days ago
benchislett
benchislett commented on 2026-03-03
mdierolf

Assignees
No one assigned
Labels