llama.cpp
Various script cleanups/fixes + convert merges and special token handling
#2842
Merged

Commits
  • convert: Fix permute calls and method/func definitions
    KerfuffleV2 committed 2 years ago
  • Cleanups for gguf-py
    KerfuffleV2 committed 2 years ago
  • Minor types cleanups.
    KerfuffleV2 committed 2 years ago
  • Initial implementation of handling merges and special tokens
    KerfuffleV2 committed 2 years ago
  • convert: Handle special tokens and merges in vocab only mode
    KerfuffleV2 committed 2 years ago
  • gguf: Refactor tensor name mapping
    KerfuffleV2 committed 2 years ago
  • convert: Fix type hint for special_token_types in SpecialVocab
    KerfuffleV2 committed 2 years ago
  • Use common special vocab handling in various conversion scripts
    KerfuffleV2 committed 2 years ago
  • First pass at implementing suggested changes
    KerfuffleV2 committed 2 years ago
  • Second pass
    KerfuffleV2 committed 2 years ago
  • gguf: SpecialVocab: Fix issue with special token content not in a dict
    KerfuffleV2 committed 2 years ago
  • convert-falcon-hf-to-gguf: Support --vocab-only option, bail out if no tokenizer.json
    KerfuffleV2 committed 2 years ago
  • convert-gptneox-hf-to-gguf and convert: Only handle merges for BPE tokenizer
    KerfuffleV2 committed 2 years ago
  • gguf: SpecialVocab: Actually set load_merges in object
    KerfuffleV2 committed 2 years ago
  • Uniform args parsing and vocab only mode for convert examples
    KerfuffleV2 committed 2 years ago
  • convert.py: Set gpt2 as tokenizer model when using BPE
    KerfuffleV2 committed 2 years ago
  • Squish last type warning in gguf.py - yay!
    KerfuffleV2 committed 2 years ago