server: improve speed of speculative decoding #17808
server: improve speed of speculative decoding
f2f08f84
fix small draft case
cac8d7b2
ngxson
marked this pull request as ready for review 5 days ago
add link to the PR
398ae8db
server : fix generation time measurement
084cec95
server : fix draft acceptance logs (add SRV_CNT, SLT_CNT macros)
f74d1ee9
server : add comment
75be6ba0
ggerganov
approved these changes
on 2025-12-08
Merge branch 'master' into xsn/server_improve_spec
ba5c0b42
Merge branch 'master' into xsn/server_improve_spec
afe25301
add PR to docs
0a63bd80
ngxson
merged
f896d2c3
into master 3 days ago