Dynamic number of speculative tokens in order to accelerate speculative decoding #33258
optimal Speculation Lookahead based on probability
1e33d372
update peer finished condition
f1d92b19
Merge branch 'huggingface:main' into SL
3a252122
add support to do_sample True
21ab0247
add stopping criteria
e7610f89
gitignore
a0b107d9
Merge branch 'main' into SL
6f15efa0
add print
adf35984
remove prints
39b9f63e
minor
bdda459c
minor
1916bcd6
git ignore
6fea2b87
Merge branch 'main' into SL
00e3e798
adding test to stopping ConfidenceCriteria
7b0103d6
doc + format
7d4a0959
add doc
1e6a0e0b
gante commented on 2024-09-05
Update .gitignore
7a005d21
update docstring and default value of assistant_confidence_threshold
201741bb
add docstring
7c90a8a5
gante approved these changes on 2024-09-10
Update src/transformers/generation/configuration_utils.py
f457553f
style fix
cd71a924
jmamou reopened this 1 year ago
jmamou deleted the SL branch 1 year ago
Assignees
No one assigned