llama.cpp
speculative: fix multimodal MTP seed positions #22881 (Closed)
trbom5c wants to merge 12 commits into ggml-org:master from trbom5c:codex/fix-mtp-multimodal
Commits (12):
1a4fe4e6 llama: allow partial seq_rm for GDN models for speculative decoding
589490f0 add enum for part sequence removal to enable checkpoints
c5e02271 review: rename rollback to rs_seq and remove public API
10829dbc llama + spec: MTP support
f8c6b03d add qwen35moe_mtp
b8ec0855 vulkan: add gdn keep_intermediates=true path
038d7876 metal: add keep_intermediates=true path for GDN
d6c4de87 convert: fix python type check
267f8afe test-llama-arch: ignore mtp heads
86d9f15e fix double free
5d5f1b46 fix: use rs for only MTP
5e53e243 fix: seed MTP drafts from logical token positions
JohannesGaessler closed this 7 days ago
github-actions added labels: model, testing, Nvidia GPU, Vulkan, examples, python, server, ggml, Apple Metal
Reviewers: JohannesGaessler, ggerganov, CISC
Assignees: no one assigned
Milestone: no milestone