Universal Assisted Generation: Assisted generation with any assistant model (by Intel Labs) (#33383)
* Update candidate_generator.py
* Update utils.py
* add lookbehind params to _get_candidate_generator
* make fixup
* add unit tests
* fix failing tests
* add docstrings
* fix docstrings; remove non-optimized AnyTokenizer
* add generation correctness test for any tokenizer
* make fixup
* fix assertion syntax
* PR review fixes
* address additional PR comments
* fix tests
* remove stopping criteria arg
* make fixup
* add AssistantConfig
* fix prev_tokens branching
* pass tokenizers through `generate()` kwargs
* fix lookbehind values; tokenizer params WIP
* fixup
* AssistantConfig
* remove AssistantConfig; apply PR suggestions
* restructure tests
* fixup
* fix assistant_tokenizer arg validation
* fixup
* fix tests in TestAssistedCandidateGeneratorDifferentTokenizers
* fix class docstring
* PR suggestions
* doc
* doc update and improvements to `_validate_assistant()`
---------
Co-authored-by: mosheber <moshe.berchansky@intel.com>