[`Core tokenization`] `add_dummy_prefix_space` option to help with latest issues #28010
add add_dummy_prefix_space option to slow
aafce553
Merge branch 'main' of github.com:huggingface/transformers into add-p…
f72cf3de
checking kwargs might be better. Should be there for all spm tokenize…
1175230c
nits
9c2060db
fix copies
5e649acb
more copied
59fb5f47
Merge branch 'main' of github.com:huggingface/transformers into add-p…
8ff0f899
nits
4757eb5b
add prefix space
3fc1a787
nit
ac77de36
Merge branch 'main' of github.com:huggingface/transformers into add-p…
aa4a7bd8
Merge branch 'main' of github.com:huggingface/transformers into add-p…
b57ec75e
nits
16750dd0
ArthurZucker marked this pull request as ready for review 2 years ago
Update src/transformers/convert_slow_tokenizer.py
0fa9ce3f
fix inti
e3a631f7
revert wrong styling
b8459407
Merge branch 'main' of github.com:huggingface/transformers into add-p…
af9d6d03
fix
32ae37ee
nits
041307ff
style
f5fb07fa
DNGros commented on 2024-02-16
Merge branch 'main' of github.com:huggingface/transformers into add-p…
59818018
updates
0f6aff23
make sure we use slow tokenizer for conversion instead of looking for…
c0efd774
support llama ast well
f69a6a9c
update llama tokenizer fast
41266647
nits
22b37abc
nits nits nits
6768ba4c
update the doc
37b36fe6
update
1e615383
update to fix tests
a870c8af
skip unrelated tailing test
ab9cf26c
Merge branch 'main' of github.com:huggingface/transformers into add-p…
e4c17e8a
Update src/transformers/convert_slow_tokenizer.py
3e9d0a21
add proper testing
601e873b
Merge branch 'add-prefix-space' of github.com:huggingface/transformer…
d9be85da
test decode as well
5e475c27
more testing
282fac4f
format
52c57728
fix llama test
f79be09c
Apply suggestions from code review
3220c303
ArthurZucker deleted the add-prefix-space branch 2 years ago