transformers
[`Core tokenization`] `add_dummy_prefix_space` option to help with latest issues
#28010
Merged

ArthurZucker merged 40 commits into main from add-prefix-space
ArthurZucker
ArthurZucker add add_dummy_prefix_space option to slow
aafce553
huggingface deleted a comment from github-actions on 2024-01-15
ArthurZucker Merge branch 'main' of github.com:huggingface/transformers into add-p…
f72cf3de
ArthurZucker checking kwargs might be better. Should be there for all spm tokenize…
1175230c
HuggingFaceDocBuilderDev
ArthurZucker nits
9c2060db
ArthurZucker fix copies
5e649acb
ArthurZucker more copied
59fb5f47
ArthurZucker Merge branch 'main' of github.com:huggingface/transformers into add-p…
8ff0f899
ArthurZucker nits
4757eb5b
ArthurZucker add prefix space
3fc1a787
ArthurZucker nit
ac77de36
ArthurZucker Merge branch 'main' of github.com:huggingface/transformers into add-p…
aa4a7bd8
ArthurZucker Merge branch 'main' of github.com:huggingface/transformers into add-p…
b57ec75e
ArthurZucker nits
16750dd0
ArthurZucker marked this pull request as ready for review 2 years ago
ArthurZucker commented on 2024-01-18
ArthurZucker Update src/transformers/convert_slow_tokenizer.py
0fa9ce3f
ArthurZucker fix inti
e3a631f7
ArthurZucker revert wrong styling
b8459407
ArthurZucker Merge branch 'main' of github.com:huggingface/transformers into add-p…
af9d6d03
ArthurZucker fix
32ae37ee
ArthurZucker nits
041307ff
ArthurZucker style
f5fb07fa
gabegrand
haileyschoelkopf
huggingface deleted a comment from github-actions on 2024-02-15
ArthurZucker
DNGros commented on 2024-02-16
claudiosv
ArthurZucker Merge branch 'main' of github.com:huggingface/transformers into add-p…
59818018
ArthurZucker updates
0f6aff23
ArthurZucker make sure we use slow tokenizer for conversion instead of looking for…
c0efd774
ArthurZucker support llama ast well
f69a6a9c
ArthurZucker update llama tokenizer fast
41266647
ArthurZucker nits
22b37abc
ArthurZucker nits nits nits
6768ba4c
ArthurZucker update the doc
37b36fe6
ArthurZucker update
1e615383
ArthurZucker update to fix tests
a870c8af
ArthurZucker skip unrelated tailing test
ab9cf26c
ArthurZucker Merge branch 'main' of github.com:huggingface/transformers into add-p…
e4c17e8a
ArthurZucker Update src/transformers/convert_slow_tokenizer.py
3e9d0a21
ArthurZucker add proper testing
601e873b
ArthurZucker Merge branch 'add-prefix-space' of github.com:huggingface/transformer…
d9be85da
ArthurZucker test decode as well
5e475c27
ArthurZucker more testing
282fac4f
ArthurZucker format
52c57728
ArthurZucker fix llama test
f79be09c
LysandreJik requested a review from LysandreJik 2 years ago
ArthurZucker commented on 2024-02-20
ArthurZucker Apply suggestions from code review
3220c303
ArthurZucker
LysandreJik approved these changes on 2024-02-20
LysandreJik commented on 2024-02-20
ArthurZucker merged 15cfe389 into main 2 years ago
ArthurZucker deleted the add-prefix-space branch 2 years ago
casper-hansen
