llama.cpp
Add self‑speculative decoding (no draft model required)
#18471
Merged

srogmann
srogmann requested a review from ngxson 56 days ago
srogmann requested a review from ggerganov 56 days ago
github-actions added examples
github-actions added server
ggerganov commented on 2025-12-31
srogmann requested a review from allozaur 53 days ago
srogmann requested a review from CISC 53 days ago
srogmann requested a review from 0cc4m 53 days ago
srogmann requested a review from JohannesGaessler 53 days ago
srogmann requested a review from danbev 53 days ago
srogmann requested a review from pwilkin 53 days ago
CISC removed review request from danbev 53 days ago
CISC removed review request from CISC 53 days ago
CISC removed review request from pwilkin 53 days ago
CISC removed review request from 0cc4m 53 days ago
CISC removed review request from JohannesGaessler 53 days ago
CISC removed review request from allozaur 53 days ago
CISC closed this 53 days ago
CISC reopened this 53 days ago
github-actions added documentation, model, script, testing, Nvidia GPU, Vulkan, python, devops, ggml, SYCL, Apple Metal
CISC removed documentation, model, script, testing, Nvidia GPU, Vulkan, python, devops, ggml, SYCL, Apple Metal
github-actions added documentation, model, script, testing, Nvidia GPU, Vulkan, python, devops, ggml, SYCL, Apple Metal
CISC removed documentation, model, script, testing, Nvidia GPU, Vulkan, python, devops, ggml, SYCL, Apple Metal
srogmann force pushed from f2299fa1 to 9fee55e2 52 days ago
CISC commented on 2026-01-02
CISC commented on 2026-01-02
ggerganov commented on 2026-01-14
ngxson commented on 2026-01-14
srogmann force pushed from 81748950 to f0ec5943 38 days ago
ngxson commented on 2026-01-22
ngxson commented on 2026-01-22
srogmann server: introduce self-speculative decoding
1fb2658b
srogmann server: moved self-call into speculative.cpp
1faeb628
srogmann can_speculate() includes self-speculation
e3e809cc
srogmann server: can_speculate() tests self-spec
38f7c287
srogmann server: replace can_speculate() with slot.can_speculate()
917f4bb1
srogmann common: use %zu format specifier for size_t in logging
f1f6584c
srogmann server: can_speculate() requires a task instance
907d094f
srogmann common: ngram map, config self-speculative decoding
456268fa
srogmann common: add enum common_speculative_type
b38eb590
srogmann common: add vector of speculative states
eb43748b
srogmann common: add option --spec-draftless
1e29af4e
srogmann server: cleanup (remove slot.batch_spec, rename)
a1584ac8
srogmann force pushed from 1761bad1 to a1584ac8 30 days ago
srogmann common: moved self-spec impl to ngram-map
cb3a4027
srogmann common: cleanup (use common_speculative_state_draft)
af382c38
ggerganov spec : refactor
924517dd
ngxson commented on 2026-01-25
ngxson commented on 2026-01-25
ggerganov cont : naming
9ac88176
srogmann spec: remove --spec-config
8ea068e5
srogmann doc: (draftless) speculative decoding
288ab505
srogmann common: print performance in spec decoding
fd4d803c
github-actions added documentation
ggerganov minor : cleanup
f895bca7
ggerganov common : better names
a3300937
ggerganov minor : cleanup + fix build
1f8d3666
ggerganov approved these changes on 2026-01-26
srogmann minor: comments
72f416e9
srogmann CODEOWNERS: add common/ngram-map.* (#18471)
dd23149d
ggerganov Merge branch 'master' into pr/18471
351e798b
ggerganov common : rename speculative.draftless_type -> speculative.type
bc338380
ggerganov ngram-map : fix uninitialized values
9f8401a5
ggerganov ngram-map : take into account the input can become shorter
003c9035
ggerganov commented on 2026-01-27
ggerganov commented on 2026-01-27
ggerganov ngram-map : revert len check for now
7164b4ff
ggerganov arg : change `--spec-draftless` -> `--spec-type`
c1ff133f
ggerganov spec : add common_speculative_state::accept()
606ff8f0
ggerganov spec : refactor + add common_speculative_begin()
aba472e8
ggerganov spec : fix begin() call with mtmd
6fc7dfc5
ggerganov spec : additional refactor + remove common_speculative_params
45da93e3
ggerganov merged 72d3b189 into master 26 days ago