openvino_tokenizers
f1f171c6 - Add Unigram Tokenizer Implementation (#431)

Commit
347 days ago
Add Unigram Tokenizer Implementation (#431)

* Fix Shape Estimation in RegexSplit
* Fix Shape Estimation in RegexSplit
* Add Unigram Tokenizer Support
  - Add UnigramTokenizer operation
  - Change the default conversion behaviour for Fast tokenizer to use new Unigram implementation instead of the Sentencepiece backend
  - Add support for Strip normalization operation
  - Separate Sentencepiece backend tests from our implementation of BPE and Unigram
* Ruff Check/Format
* Update Tests
* Update Tests
* Fix Split Parsing
* Fix Pass Rate
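
For readers unfamiliar with Unigram tokenization, the sketch below illustrates the general technique: a Viterbi search for the segmentation with the highest total log-probability. It is not the UnigramTokenizer operation added in this commit. The function name, the example vocabulary, and the `unk_token` and `unk_score` parameters are all hypothetical, and details such as byte fallback or merging consecutive unknown pieces are left out.

```python
import math


def unigram_tokenize(text, vocab, unk_token="<unk>", unk_score=-20.0):
    """Segment `text` into the highest-scoring sequence of vocabulary pieces.

    `vocab` maps each piece to its log-probability. A single character that
    is not in the vocabulary is emitted as `unk_token` with `unk_score`
    (a hypothetical penalty chosen for this illustration).
    """
    n = len(text)
    max_len = max((len(piece) for piece in vocab), default=1)

    # best_score[i] is the best total score for the prefix text[:i].
    best_score = [-math.inf] * (n + 1)
    best_start = [0] * (n + 1)     # start index of the last piece ending at i
    best_token = [None] * (n + 1)  # the last piece ending at i
    best_score[0] = 0.0

    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            piece = text[start:end]
            if piece in vocab:
                score, token = best_score[start] + vocab[piece], piece
            elif end - start == 1:
                # Unknown single character: fall back to the unknown token.
                score, token = best_score[start] + unk_score, unk_token
            else:
                continue
            if score > best_score[end]:
                best_score[end] = score
                best_start[end] = start
                best_token[end] = token

    # Walk the back-pointers from the end of the text to recover the pieces.
    tokens = []
    pos = n
    while pos > 0:
        tokens.append(best_token[pos])
        pos = best_start[pos]
    tokens.reverse()
    return tokens


# Hypothetical toy vocabulary with made-up log-probabilities.
EXAMPLE_VOCAB = {
    "un": -2.0,
    "i": -3.0,
    "gram": -2.5,
    "u": -4.0,
    "n": -4.0,
    "unigram": -9.0,
}

print(unigram_tokenize("unigram", EXAMPLE_VOCAB))
```

With this toy vocabulary, splitting into "un", "i", "gram" scores -7.5, which beats the single piece "unigram" at -9.0, so the search prefers the three-piece segmentation.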