llama.cpp
Include speculative decoding stats when timings_per_token is enabled
#12603
Merged

ggerganov merged 3 commits into ggml-org:master from mostlygeek:master
mostlygeek Include speculative decoding stats when timings_per_token is true
2dc29181
mostlygeek requested a review from ngxson 1 year ago
github-actions added examples
github-actions added server
ngxson approved these changes on 2025-03-27
ngxson
ggerganov commented on 2025-03-27
jukofyork
mostlygeek Remove redundant draft_accept_ratio var
41a8e85d
mostlygeek add draft acceptance rate to server console output
429820ec
mostlygeek closed this 1 year ago
mostlygeek reopened this 1 year ago
ggerganov
ggerganov
ggerganov approved these changes on 2025-03-28
mostlygeek requested a review from ggerganov 1 year ago
ggerganov merged 5d016702 into master 1 year ago
