PR #5423 Add support for BERT embedding models

BERT WIP

cebtenzzre committed 2 years ago

merge from master

iamlemec committed 2 years ago

it runs; tokenization is messed up; pooling is wrong for multi batches

iamlemec committed 2 years ago

add in wordpiece tokenizer

iamlemec committed 2 years ago

put causal_attn flag in gguf

iamlemec committed 2 years ago

Merge remote-tracking branch 'origin/master' into bert

iamlemec committed 2 years ago

Merge remote-tracking branch 'upstream/master' into bert

iamlemec committed 2 years ago

Update convert-hf-to-gguf.py

iamlemec committed 2 years ago

add causal attention gguf key

iamlemec committed 2 years ago

use ctx_output for tok_norm of BERT and BLOOM

cebtenzzre committed 2 years ago

bert : add some missing graph callbacks

cebtenzzre committed 2 years ago

fix up model sizing and result acquisition

iamlemec committed 2 years ago

hard-code token_type = 0

iamlemec committed 2 years ago

Merge branch 'bert' of github.com:iamlemec/llama.cpp into bert

iamlemec committed 2 years ago

style fixes

iamlemec committed 2 years ago

undo attempted type_embd simplify

iamlemec committed 2 years ago

bert : simplify token type embedding access

cebtenzzre committed 2 years ago

flake8 : add W503 to ignore list

cebtenzzre committed 2 years ago

minor : code style normalization

ggerganov committed 2 years ago

avoid use of ggml_graph_get_tensor

iamlemec committed 2 years ago

Merge branch 'bert' of github.com:iamlemec/llama.cpp into bert

iamlemec committed 2 years ago

llama.cpp: Add support for BERT embedding models (#5423, merged)