Include speculative decoding stats when timings_per_token is enabled #12603
Include speculative decoding stats when timings_per_token is true
2dc29181
ngxson
approved these changes
on 2025-03-27
Remove redundant draft_accept_ratio var
41a8e85d
add draft acceptance rate to server console output
429820ec
ggerganov
approved these changes
on 2025-03-28
ggerganov
merged
5d016702
into master 1 year ago