llama.cpp
llama-server : implement universal assisted decoding
#12635
Merged

CISC merged 27 commits into ggml-org:master from g2mt:master
g2mt
g2mt llama-server : implement universal assisted decoding
6f962699
g2mt requested a review from ngxson 1 year ago
github-actions added examples
github-actions added server
g2mt Merge branch 'master' into master
6f74c9c4
jukofyork
g2mt Merge remote-tracking branch 'fork/master' into universal-decoding
e6676458
g2mt Erase prompt tail for kv-cache
ff9e0623
g2mt set vocab_dft_compatible in common_speculative
39ca594a
g2mt rename ctx_main to ctx_tgt
eb424dd6
g2mt move vocab_dft_compatible to spec struct
2550f11f
g2mt clear mem_dft, remove mem
3c35c9d9
g2mt detokenize id_last for incompatible models
12751c9d
g2mt update comment
84199317
g2mt add --spec-replace flag
b9fdf203
g2mt accept special tokens when translating between draft/main models
160769de
g2mt requested a review from JohannesGaessler 329 days ago
g2mt requested a review from ggerganov 329 days ago
github-actions added documentation
github-actions added build
github-actions added script
github-actions added testing
github-actions added android
github-actions added Nvidia GPU
github-actions added Vulkan
github-actions added python
github-actions added devops
github-actions added ggml
github-actions added SYCL
github-actions added Apple Metal
github-actions added Ascend NPU
g2mt Merge remote-tracking branch 'upstream/master'
ebaa82ec
g2mt closed this 329 days ago
Mushoz
g2mt
g2mt reopened this 329 days ago
g2mt marked this pull request as draft 329 days ago
g2mt
g2mt marked this pull request as ready for review 329 days ago
g2mt Escape spec-replace
d1f32aba
g2mt Merge branch 'ggml-org:master' into master
3afb5567
g2mt clamp draft result to size to params.n_draft
d23892ec
g2mt Merge branch 'ggml-org:master' into master
c382c281
CISC removed documentation
CISC removed build
CISC removed script
CISC removed testing
CISC removed android
CISC removed Nvidia GPU
CISC removed Vulkan
CISC removed python
CISC removed devops
CISC removed ggml
CISC removed SYCL
CISC removed Apple Metal
CISC removed Ascend NPU
g2mt Merge branch 'ggml-org:master' into master
e14bafb4
g2mt Merge branch 'ggml-org:master' into master
f8cee4e0
CISC requested a review from copilot-pull-request-reviewer 299 days ago
copilot-pull-request-reviewer
copilot-pull-request-reviewer commented on 2025-07-28
g2mt fix comment
2cc9e2e1
g2mt clean up code
829b7624
CISC
CISC approved these changes on 2025-07-29
CISC removed review request from ggerganov 298 days ago
CISC removed review request from ngxson 298 days ago
CISC removed review request from JohannesGaessler 298 days ago
CISC requested a review from ggerganov 298 days ago
g2mt force pushed to 829b7624 298 days ago
g2mt restore old example
b045eac6
g2mt force pushed to 79d2be41 298 days ago
g2mt log common_speculative_are_compatible in speculative example
79d2be41
g2mt fix
6acc6814
ggerganov
ggerganov commented on 2025-07-30
g2mt Update common/speculative.cpp
50908f29
g2mt Update common/speculative.cpp
e866f230
g2mt Update common/speculative.cpp
24cede7e
CISC merged 94933c8c into master 297 days ago
ggerganov
CISC
CISC
CISC
g2mt
CISC
