llama.cpp
Add Unigram tokenizer needed by T5 and FLAN-T5 model families
#8089
Merged

fairydreaming
sszymczy llama : add T5 model architecture, tensors and model header parameters
c2c799ce
mofosyne added the Review Complexity : Medium label
sszymczy llama : add handling of byte tokens in UGM tokenizer (same as in SPM)
f4c03c09
ggerganov approved these changes on 2024-06-25
sszymczy llama : replace allocated precompiled_charsmap buffer with std::vecto…
87b7dd23
fairydreaming Merge branch 'ggerganov:master' into t5-clean-2
21d36842
sszymczy llama : fix whitespace formatting
f23ff913
fairydreaming merged 6fcbf682 into master 1 year ago
ggerganov commented on 2024-07-02
