Adaptive dynamic number of speculative tokens (#34156)
* initial commit
* update strategy
* add FPR/TPR tradeoff with cost
* all probs
* fix
* fix
* fix style
* Update src/transformers/generation/configuration_utils.py
shorter docstring
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* import guard
* fix style
* add is_sklearn_available condition
* vectorize to flatten the for-loop
* fix style
* disable adaptation for UAG (universal assisted generation)
* update doc
* add TestAssistedCandidateGeneratorUpdateStrategy
* fix style
* protect import
* fix style
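
The FPR/TPR cost tradeoff listed above can be illustrated roughly as follows. This is a minimal sketch under stated assumptions, not the code merged in this PR: the function name `pick_confidence_threshold` and the weight `fn_weight` are hypothetical, and it uses NumPy alone rather than scikit-learn. The idea is to treat each speculated token as a binary example (accepted or rejected by the target model), scan candidate thresholds on the assistant's confidence, and keep the one with the lowest weighted error cost.

```python
import numpy as np


def pick_confidence_threshold(probs, matches, fn_weight=3.0):
    """Return the confidence threshold that minimizes FPR + fn_weight * FNR.

    probs:     assistant probability for each speculated token
    matches:   1 if the target model accepted that token, else 0
    fn_weight: hypothetical penalty for stopping too early (false negatives)
    """
    probs = np.asarray(probs, dtype=float)
    matches = np.asarray(matches, dtype=bool)
    if matches.all() or not matches.any():
        raise ValueError("need at least one accepted and one rejected token")

    # Candidate thresholds: every observed probability, highest first.
    thresholds = np.unique(probs)[::-1]
    # predicted[i, j] is True when token j would be kept under threshold i.
    predicted = probs[None, :] >= thresholds[:, None]

    # Vectorized rates over all thresholds at once (no Python loop).
    tpr = (predicted & matches).sum(axis=1) / matches.sum()
    fpr = (predicted & ~matches).sum(axis=1) / (~matches).sum()
    costs = fpr + fn_weight * (1.0 - tpr)
    return float(thresholds[np.argmin(costs)])


# Accepted tokens have high confidence, rejected ones low confidence.
print(pick_confidence_threshold([0.9, 0.8, 0.7, 0.3, 0.2, 0.1], [1, 1, 1, 0, 0, 0]))
```

A larger `fn_weight` favors lower thresholds, so the assistant keeps drafting longer at the risk of more rejected tokens. A smaller one stops drafting earlier.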
---------
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>