llama.cpp
Add self‑speculative decoding (no draft model required)
#18471
Merged
ggerganov merged 34 commits into ggml-org:master from srogmann:feature/self-speculative
srogmann
requested a review
from
ngxson
56 days ago
srogmann
requested a review
from
ggerganov
56 days ago
github-actions
added
examples
github-actions
added
server
ggerganov
commented on 2025-12-31
srogmann
requested a review
from
allozaur
53 days ago
srogmann
requested a review
from
CISC
53 days ago
srogmann
requested a review
from
0cc4m
53 days ago
srogmann
requested a review
from
JohannesGaessler
53 days ago
srogmann
requested a review
from
danbev
53 days ago
srogmann
requested a review
from
pwilkin
53 days ago
CISC
removed review request
from
danbev
53 days ago
CISC
removed review request
from
CISC
53 days ago
CISC
removed review request
from
pwilkin
53 days ago
CISC
removed review request
from
0cc4m
53 days ago
CISC
removed review request
from
JohannesGaessler
53 days ago
CISC
removed review request
from
allozaur
53 days ago
CISC
closed this
53 days ago
CISC
reopened this
53 days ago
github-actions
added
documentation
github-actions
added
model
github-actions
added
script
github-actions
added
testing
github-actions
added
Nvidia GPU
github-actions
added
Vulkan
github-actions
added
python
github-actions
added
devops
github-actions
added
ggml
github-actions
added
SYCL
github-actions
added
Apple Metal
CISC
removed
documentation
CISC
removed
model
CISC
removed
script
CISC
removed
testing
CISC
removed
Nvidia GPU
CISC
removed
Vulkan
CISC
removed
python
CISC
removed
devops
CISC
removed
ggml
CISC
removed
SYCL
CISC
removed
Apple Metal
github-actions
added
documentation
github-actions
added
model
github-actions
added
script
github-actions
added
testing
github-actions
added
Nvidia GPU
github-actions
added
Vulkan
github-actions
added
python
github-actions
added
devops
github-actions
added
ggml
github-actions
added
SYCL
github-actions
added
Apple Metal
CISC
removed
documentation
CISC
removed
model
CISC
removed
script
CISC
removed
testing
CISC
removed
Nvidia GPU
CISC
removed
Vulkan
CISC
removed
python
CISC
removed
devops
CISC
removed
ggml
CISC
removed
SYCL
CISC
removed
Apple Metal
srogmann
force pushed
from
f2299fa1
to
9fee55e2
52 days ago
CISC
commented on 2026-01-02
CISC
commented on 2026-01-02
ggerganov
commented on 2026-01-14
ngxson
commented on 2026-01-14
srogmann
force pushed
from
81748950
to
f0ec5943
38 days ago
ngxson
commented on 2026-01-22
ngxson
commented on 2026-01-22
server: introduce self-speculative decoding
1fb2658b
server: moved self-call into speculative.cpp
1faeb628
can_speculate() includes self-speculation
e3e809cc
server: can_speculate() tests self-spec
38f7c287
server: replace can_speculate() with slot.can_speculate()
917f4bb1
common: use %zu format specifier for size_t in logging
f1f6584c
server: can_speculate() requires a task instance
907d094f
common: ngram map, config self-speculative decoding
456268fa
common: add enum common_speculative_type
b38eb590
common: add vector of speculative states
eb43748b
common: add option --spec-draftless
1e29af4e
server: cleanup (remove slot.batch_spec, rename)
a1584ac8
srogmann
force pushed
from
1761bad1
to
a1584ac8
30 days ago
common: moved self-spec impl to ngram-map
cb3a4027
common: cleanup (use common_speculative_state_draft)
af382c38
spec : refactor
924517dd
ngxson
commented on 2026-01-25
ngxson
commented on 2026-01-25
cont : naming
9ac88176
spec: remove --spec-config
8ea068e5
doc: (draftless) speculative decoding
288ab505
common: print performance in spec decoding
fd4d803c
github-actions
added
documentation
minor : cleanup
f895bca7
common : better names
a3300937
minor : cleanup + fix build
1f8d3666
ggerganov
approved these changes on 2026-01-26
minor: comments
72f416e9
CODEOWNERS: add common/ngram-map.* (#18471)
dd23149d
Merge branch 'master' into pr/18471
351e798b
common : rename speculative.draftless_type -> speculative.type
bc338380
ngram-map : fix uninitialized values
9f8401a5
ngram-map : take into account the input can become shorter
003c9035
ggerganov
commented on 2026-01-27
ggerganov
commented on 2026-01-27
ngram-map : revert len check for now
7164b4ff
arg : change `--spec-draftless` -> `--spec-type`
c1ff133f
spec : add common_speculative_state::accept()
606ff8f0
spec : refactor + add common_speculative_begin()
aba472e8
spec : fix begin() call with mtmd
6fc7dfc5
spec : additional refactor + remove common_speculative_params
45da93e3
ggerganov
merged
72d3b189
into master
26 days ago
Reviewers
ggerganov
ngxson
CISC
Assignees
No one assigned
Labels
documentation
examples
server
Milestone
No milestone