llama.cpp
speculative: fix multimodal MTP seed positions #22881 (Closed)
trbom5c wants to merge 12 commits into ggml-org:master from trbom5c:codex/fix-mtp-multimodal
Commits (12):
1a4fe4e6 llama: allow partial seq_rm for GDN models for speculative decoding
589490f0 add enum for part sequence removal to enable checkpoints
c5e02271 review: rename rollback to rs_seq and remove public API
10829dbc llama + spec: MTP support
f8c6b03d add qwen35moe_mtp
b8ec0855 vulkan: add gdn keep_intermediates=true path
038d7876 metal: add keep_intermediates=true path for GDN
d6c4de87 convert: fix python type check
267f8afe test-llama-arch: ignore mtp heads
86d9f15e fix double free
5d5f1b46 fix: use rs for only MTP
5e53e243 fix: seed MTP drafts from logical token positions
JohannesGaessler closed this 7 days ago
github-actions added labels: model, testing, Nvidia GPU, Vulkan, examples, python, server, ggml, Apple Metal
Reviewers: JohannesGaessler, ggerganov, CISC
Assignees: no one assigned
Milestone: no milestone