Add LRU cache, add faster tokenization #37
Update gpt2_tokenization.py
160e2bd5
Update gpt2_tokenization.py
e7c3d51c
Update gpt2_tokenization.py
dc800086
Update preprocess_data.py
a405b9ea
Update gpt2_tokenization.py
54ab4e37
Merge branch 'bigscience-workshop:main' into main
e729aba9
thomasw21
approved these changes
on 2021-08-04
Update megatron/tokenizer/gpt2_tokenization.py
35011493
huu4ontocord
changed the title from "Add LRU cache, add faster tokenization, and add optional Chinese tokenization." to "Add LRU cache, add faster tokenization" 4 years ago
Update gpt2_tokenization.py
75cce0bc
stas00
approved these changes
on 2021-08-04
Update megatron/tokenizer/gpt2_tokenization.py
cc579250
Update gpt2_tokenization.py
18118923
Update gpt2_tokenization.py
02b2d2fb