SPLIT PR: add_prefix_space fix #31315
itazap force pushed to 1dd79be0 1 year ago
itazap marked this pull request as ready for review 1 year ago
itazap force pushed 1 year ago
SPLIT PR: add_prefix_space fix
6c429b21
fix if case
ab3fdccd
missed case
0d53ff10
keep kwargs
458b2422
bytelevel handling
acf739c1
fix merge
50ec1225
case fix
014d3961
fix cases
aeca3f1a
remove force_froms_slow
2e2db921
consider bytelevel
19ca2af5
update cases for pretokenizer updating
328c12b6
adding no sentencepiece test
9b8257ed
ruff llama
0f1c5433
update_test
b9fd8825
fix sentencepiece case
b0d4456a
add sentencepiece req
584f899d
remove test:(
406ccde2
ruff
f8f8f388
ruff
c36e6168
remove t5 copy from seamless m4
fafe5b79
set pretokenizer if unset
610c319b
fix case for pretokenizer
3e645791
added llama and t5 tests for legacy prefix space in fast tokenizer
2e054a7a
modify tokenization test to test without sentencepiece
ebf781ec
itazap force pushed to ebf781ec 1 year ago
removing unused imports
61fd1023
Trigger CI attempt
20dc2bb5
Trigger CI attempt
83af87b5
reverting sentencepiece test
d2990afe
update tests to only test legacy=False
d50bbc8a
applying feedback to simplify pretokenizer and normalizer updates, + …
f48f1b46
itazap force pushed to f48f1b46 1 year ago
remove common test and ruff
632a21cb
updating feedback 2
80ba03ed
fix test and ruff
423bce29
fixing test
78bccc63
add test tokenizer without sentencepiece for huggyllama
1de64574
itazap reopened this 1 year ago