llama.cpp
BERT tokenizer fixes
#6498
Merged

cebtenzzre merged 9 commits into master from ceb/bert-tokenizer-fixes
cebtenzzre convert-hf-to-gguf : fix BERT abuse of LlamaHfVocab
748fc8ba
cebtenzzre llama : handle added special tokens like HF does
88035827
cebtenzzre Merge branch 'master' into ceb/bert-tokenizer-fixes
0d052cbe
cebtenzzre convert : fix Tensor type annotations
6a9d3c09
cebtenzzre convert scripts : fix python 3.8 compatibility
909f6be2
cebtenzzre convert : remove now-unused ignore_nonllama parameter
45983e3a
cebtenzzre spm : fix special_add_bos default
d1a1b614
cebtenzzre examples : rely on new behavior of add_special
92591c12
cebtenzzre speculative : more robust tokenizer comparison
a37696d4
cebtenzzre requested a review from iamlemec 1 year ago
cebtenzzre requested a review from ggerganov 1 year ago
iamlemec commented on 2024-04-05
iamlemec approved these changes on 2024-04-05
ggerganov approved these changes on 2024-04-08
cebtenzzre merged 1b67731e into master 1 year ago
