tests : add test-tokenizer-0.sh #7036
tests : add test-tokenizer-0.sh
ce7d3a04
ggerganov force pushed to ce7d3a04 1 year ago
unicode : add all unicode number ranges
7053b261
starcoder : fix pre-tokenizer
cf00fe1e
tests : add test that fails with DeepSeek tokenizers
3a461dbf
falcon : fix regex
3275e60f
unicode : regenerate unicode tables
cd7c728a
refact : add tokenizer model
d53240cc
lint : fix
c30056a7
tests : disable failing tests
bc26eb75
refact : add tests files
9745cf88
Merge branch 'master' into gg/add-tokenizer-test-script
26f606ef
convert : print -> logging
d974aed5
lint : fix
5f30e30a
ggerganov force pushed to 5f30e30a 1 year ago
unicode : digit -> number
f19b45cb
phi-3 : update
7e11d409
ggerganov merged 92139b90 into master 1 year ago
ggerganov deleted the gg/add-tokenizer-test-script branch 1 year ago