llama.cpp
speculative: fix multimodal MTP seed positions
#22881
Closed

trbom5c wants to merge 12 commits into ggml-org:master from trbom5c:codex/fix-mtp-multimodal
trbom5c
am17an llama: allow partial seq_rm for GDN models for speculative decoding
1a4fe4e6
am17an add enum for part sequence removal to enable checkpoints
589490f0
am17an review: rename rollback to rs_seq and remove public API
c5e02271
am17an llama + spec: MTP support
10829dbc
am17an add qwen35moe_mtp
f8c6b03d
am17an vulkan: add gdn keep_intermediates=true path
b8ec0855
am17an metal: add keep_intermediates=true path for GDN
038d7876
am17an convert: fix python type check
d6c4de87
am17an test-llama-arch: ignore mtp heads
267f8afe
am17an fix double free
86d9f15e
am17an fix: use rs for only MTP
5d5f1b46
fix: seed MTP drafts from logical token positions
5e53e243
JohannesGaessler closed this 7 days ago
ggml-gh-bot
github-actions added model
github-actions added testing
github-actions added Nvidia GPU
github-actions added Vulkan
github-actions added examples
github-actions added python
github-actions added server
github-actions added ggml
github-actions added Apple Metal

