llama.cpp
Various script cleanups/fixes + convert merges and special token handling
#2842

Merged

Commits

convert: Fix permute calls and method/func definitions
KerfuffleV2 committed 2 years ago

Cleanups for gguf-py
KerfuffleV2 committed 2 years ago

Minor types cleanups.
KerfuffleV2 committed 2 years ago

Initial implementation of handling merges and special tokens
KerfuffleV2 committed 2 years ago

convert: Handle special tokens and merges in vocab only mode
KerfuffleV2 committed 2 years ago

gguf: Refactor tensor name mapping
KerfuffleV2 committed 2 years ago

convert: Fix type hint for special_token_types in SpecialVocab
KerfuffleV2 committed 2 years ago

Use common special vocab handling in various conversion scripts
KerfuffleV2 committed 2 years ago

First pass at implementing suggested changes
KerfuffleV2 committed 2 years ago

Second pass
KerfuffleV2 committed 2 years ago

gguf: SpecialVocab: Fix issue with special token content not in a dict
KerfuffleV2 committed 2 years ago

convert-falcon-hf-to-gguf: Support --vocab-only option, bail out if no tokenizer.json
KerfuffleV2 committed 2 years ago

convert-gptneox-hf-to-gguf and convert: Only handle merges for BPE tokenizer
KerfuffleV2 committed 2 years ago

gguf: SpecialVocab: Actually set load_merges in object
KerfuffleV2 committed 2 years ago

Uniform args parsing and vocab only mode for convert examples
KerfuffleV2 committed 2 years ago

convert.py: Set gpt2 as tokenizer model when using BPE
KerfuffleV2 committed 2 years ago

Squish last type warning in gguf.py - yay!
KerfuffleV2 committed 2 years ago