use `TokenizersBackend` (#42894)
* use `TokenizersBackend`
* fixes
* prioritize mapping
* only use mapping for some models
* fix fallback
* undo debug thing
* add case to tokenizersbackend init
* add default bos eos token to tok backend
* set bos eos
* fix more models
* mistral idefics
* fix stopping criteria test
* try stopping criteria fix
* rebase
* update tokenizer model for stopping criteria test
* fix tuple mapping for ministral
* ignore `tokenizer_class` as it is always wrong
* up
* try to fix idefics
* fix unispeech and maybe others: fall back to the saved class if conversion was not possible
* nits
* fixup
* TIL that it was ALSO saved in config.json...
* arf
* fallback to tok config if no config json
* people who map to Llama probably don't even want llama either..
* processors to load tokbackend
* auto fix order
* try diff order
* mistral fix for weird chars
* reorder
* random fix attempt for failing tests that are failing locally so idk how to check these
* trying an older commit
* fix mistral
* map unispeech
* try something out
* update
* nits
* trying to be a little bit more restrictive
* token type ids for tokenizers should be explicit... let's see which tests fail this and we'll add them to the specific classes?
* Nit
* idefics 1-2 are actually the only ones that should be forced to map to llama
* small fixes
* fix layout
* fixup
* fix some tests
* 1 nit
* aria fix
* style
* canine
* fixup
* very small test
* style
* update to tokenizersbackend
---------
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-164-45.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-168-52.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-174-196.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-167-217.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-167-111.ec2.internal>
Co-authored-by: itazap <ita.zaporozhets@huggingface.co>
Co-authored-by: Ita Zaporozhets <31893021+itazap@users.noreply.github.com>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-164-75.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-160-100.ec2.internal>