llama.cpp
server: improve speed of speculative decoding
#17808
Merged

ngxson
ngxson server: improve speed of speculative decoding
f2f08f84
github-actions added examples
github-actions added server
ngxson fix small draft case
cac8d7b2
ngxson marked this pull request as ready for review 5 days ago
ngxson requested a review from ggerganov 5 days ago
ngxson
ngxson add link to the PR
398ae8db
theo77186
ggerganov server : fix generation time measurement
084cec95
ggerganov server : fix draft acceptance logs (add SRV_CNT, SLT_CNT macros)
f74d1ee9
ggerganov server : add comment
75be6ba0
ggerganov
ggerganov approved these changes on 2025-12-08
ngxson
ngxson Merge branch 'master' into xsn/server_improve_spec
ba5c0b42
ggerganov
ngxson Merge branch 'master' into xsn/server_improve_spec
afe25301
ngxson add PR to docs
0a63bd80
ngxson merged f896d2c3 into master 3 days ago
Nindaleth commented on 2025-12-11
