Fix for fast tokenizers save_pretrained compatibility with Python. (#2933)
* Renamed file generate by tokenizers when calling save_pretrained to match python.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Added save_vocabulary tests.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Remove python quick and dirty fix for clean Rust impl.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Bump tokenizers dependency to 0.5.1
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* TransfoXLTokenizerFast uses a json vocabulary file + warning about incompatibility between Python and Rust
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Added some save_pretrained / from_pretrained unittests.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Update tokenizers to 0.5.2
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Quality and format.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* flake8
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Making sure there is really a bug in unittest
* Fix TransfoXL constructor vocab_file / pretrained_vocab_file mixin.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>