llama.cpp
MPT support in llama.cpp
#3417
Merged

Commits
  • CUDA: added support for ggml_clamp (see also: https://github.com/ggerganov/ggml/issues/545)
    jploski committed 2 years ago
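    A rough sketch of what `ggml_clamp` computes element-wise (min/max saturation); MPT uses it to clamp the fused QKV projection to `[-clip_qkv, clip_qkv]`. This is an illustration of the semantics only, not the CUDA kernel itself:

    ```python
    def clamp(values, lo, hi):
        # element-wise clamp, mirroring ggml_clamp semantics: min(max(x, lo), hi)
        return [min(max(v, lo), hi) for v in values]

    # e.g. with clip_qkv = 6.0, out-of-range activations saturate at +/- 6
    print(clamp([-10.0, 0.5, 10.0], -6.0, 6.0))
    ```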
  • mpt : added an implementation based (mostly) on falcon integration, modified with deltas from ggml/examples/mpt
    jploski committed 2 years ago
  • mpt : protect against "clip_qkv": null in mpt-7b
    jploski committed 2 years ago
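    The fix above handles configs where `attn_config.clip_qkv` is present but `null`. A minimal sketch of the idea, assuming a hypothetical helper (the real logic lives in the convert script and model loader): a `null` value means "no clamping", so the clamp op (and the GGUF key) is simply skipped.

    ```python
    import json

    def read_clip_qkv(config_json: str):
        # MPT's config.json nests clip_qkv under attn_config; it may be null
        cfg = json.loads(config_json)
        clip = cfg.get("attn_config", {}).get("clip_qkv")
        # None signals "no clamping": the caller omits ggml_clamp entirely
        return clip

    print(read_clip_qkv('{"attn_config": {"clip_qkv": null}}'))
    print(read_clip_qkv('{"attn_config": {"clip_qkv": 6.0}}'))
    ```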
  • mpt : quick fix to avoid "Strange model" warning when quantizing MPT models
    jploski committed 2 years ago
  • mpt : addendum to changeset:84e30e8 - leave parameter clamp_kqv out from metadata rather than use 0.0 to indicate "no clamping" (more compliant with the current GGUF spec?)
    jploski committed 2 years ago
  • mpt : standardized all tensor names to follow GGUF spec
    jploski committed 2 years ago
  • mpt : addendum to changeset:1be89c40 - use "req" parameter of GGUF_GET_KEY macro instead of duplicate code
    jploski committed 2 years ago
  • mpt : fixed comment s/gptneox/mpt/
    jploski committed 2 years ago
  • mpt : remove tabs, trailing whitespace
    jploski committed 2 years ago
  • mpt : removed ne01 + n_past == ne00 assertion from alibi (cuda/f32) and rope_shift from build_mpt
    jploski committed 2 years ago
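    For context on the ALiBi path touched above: ALiBi adds a per-head linear bias to attention scores instead of rotary embeddings. A simplified sketch of the head slopes for a power-of-two head count, parameterized by `max_alibi_bias` (names here are illustrative, not the ggml implementation):

    ```python
    def alibi_slopes(n_head: int, max_bias: float = 8.0):
        # slope for 0-based head k: 2^(-(k + 1) * max_bias / n_head)
        # assumes n_head is a power of two; ggml handles other counts differently
        return [2.0 ** (-(k + 1) * max_bias / n_head) for k in range(n_head)]

    print(alibi_slopes(8))  # first head gets the steepest slope, 0.5
    ```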
  • mpt : updated convert-mpt-hf-to-gguf.py to reflect changes made to convert-gptneox-hf-to-gguf.py in pr:3252
    cebtenzzre committed 2 years ago
  • Merge branch 'master' of https://github.com/ggerganov/llama.cpp into pull-3417
    cebtenzzre committed 2 years ago
  • comment out n_past instead of marking it unused
    cebtenzzre committed 2 years ago
  • mpt : removed hardcoded +178 from convert script in favor of utilizing hparams["vocab_size"]
    jploski committed 2 years ago
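    The change above replaces a hardcoded vocab offset with `hparams["vocab_size"]`: MPT's embedding matrix can have more rows than the tokenizer defines tokens, so the token list is padded out to the model's vocab size. A hypothetical sketch of that padding step (the dummy-token naming is an assumption, not the script's exact output):

    ```python
    def pad_vocab(tokens, vocab_size):
        # the embedding matrix has hparams["vocab_size"] rows, which may exceed
        # the tokenizer's token count; fill the gap with placeholder tokens
        padded = list(tokens)
        for i in range(len(tokens), vocab_size):
            padded.append(f"[PAD{i}]")
        return padded

    print(len(pad_vocab(["a", "b", "c"], 8)))
    ```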
  • mpt : remove unused tokenizer_json in convert script
    cebtenzzre committed 2 years ago
  • ggml : remove obsolete n_past assert in ggml_alibi
    ggerganov committed 2 years ago
  • llama : print clamp_kqv and max_alibi_bias hparams
    ggerganov committed 2 years ago