MPT support in llama.cpp #3417
CUDA: added support for ggml_clamp (see also: https://github.com/gger…
b49792b0
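A minimal sketch of what an element-wise clamp kernel for `ggml_clamp` can look like on the CUDA side; the kernel name, signature, and launch details here are illustrative, not necessarily what this commit adds:

```cpp
// Illustrative CUDA kernel: clamp each element of x into [min, max].
// (Hypothetical name/signature; the commit's actual kernel may differ.)
static __global__ void clamp_f32(const float * x, float * dst,
                                 const float min, const float max, const int k) {
    const int i = blockDim.x * blockIdx.x + threadIdx.x;
    if (i >= k) {
        return;
    }
    dst[i] = fminf(fmaxf(x[i], min), max);
}
```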
mpt : added an implementation based (mostly) on falcon integration, m…
15236e85
mpt : protect against "clip_qkv": null in mpt-7b
84e30e89
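MPT-7B's HF config stores `"clip_qkv": null`, so the loader has to tolerate a missing value. A minimal sketch of the guard, assuming a `clamp_kqv` hparam where `0.0f` stands in for the JSON null (names and convention are illustrative):

```cpp
#include "ggml.h"

// Apply QKV clamping only when clip_qkv was set in the model config;
// clamp_kqv == 0.0f represents "no clamp" (illustrative convention).
static struct ggml_tensor * maybe_clamp_qkv(struct ggml_context * ctx,
                                            struct ggml_tensor  * cur,
                                            float clamp_kqv) {
    if (clamp_kqv > 0.0f) {
        cur = ggml_clamp(ctx, cur, -clamp_kqv, clamp_kqv);
    }
    return cur;
}
```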
mpt : quick fix to avoid "Strange model" warning when quantizing MPT …
00e8c5c5
mpt : addendum to changeset:84e30e8 - leave parameter clamp_kqv out f…
1be89c40
mpt : standardized all tensor names to follow GGUF spec
26c253ed
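For reference, the GGUF spec derives per-layer tensor names from a block index plus a standard suffix. A small sketch of how such names are typically built (the exact MPT mapping is not reproduced here):

```cpp
#include <cstdio>

// Print GGUF-spec tensor names for one block, e.g. "blk.0.attn_qkv.weight".
int main() {
    const char * suffixes[] = {
        "attn_norm", "attn_qkv", "attn_output",
        "ffn_norm",  "ffn_up",   "ffn_down",
    };
    for (const char * suffix : suffixes) {
        printf("blk.%d.%s.weight\n", /*block index*/ 0, suffix);
    }
    return 0;
}
```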
mpt : addendum to changeset:1be89c40 - use "req" parameter of GGUF_GE…
df072d2d
mpt : fixed comment s/gptneox/mpt/
90e7d6de
mpt : remove tabs, trailing whitespace
47080129
ggerganov approved these changes on 2023-10-03
mpt : removed ne01 + n_past == ne00 assertion from alibi (cuda/f32) a…
1364bcd7
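ALiBi adds a per-head linear bias to the attention scores instead of positional embeddings; the removed assertion had restricted which shapes `ggml_alibi` would accept. For context, a sketch of the standard per-head slope from the ALiBi paper, assuming a power-of-two head count (MPT's config calls the bias cap `alibi_bias_max`, typically 8):

```cpp
#include <cmath>

// Slope for head h (0-based) out of n_head, assuming n_head is a power
// of two; with max_bias = 8 this is the 2^(-8(h+1)/n_head) slope from
// the ALiBi paper. Non-power-of-two counts need the paper's interleaving.
static float alibi_slope(int h, int n_head, float max_bias) {
    return powf(2.0f, -max_bias * (h + 1) / n_head);
}
```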
mpt : updated convert-mpt-hf-to-gguf.py to reflect changes made to co…
7d6a24aa
Merge branch 'master' of https://github.com/ggerganov/llama.cpp into …
292363e5
cebtenzzre force-pushed from 39fa4be3 to 292363e5 2 years ago
comment out n_past instead of marking it unused
ad3c2f3b
mpt : removed hardcoded +178 from convert script in favor of utilizin…
1a454eb5
mpt : remove unused tokenizer_json in convert script
32172f12
ggml : remove obsolete n_past assert in ggml_alibi
96cf3f5d
llama : print clamp_kqv and max_alibi_bias hparams
9b66378c
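A sketch of what the added hparam logging can look like; the field names and output format are assumptions, not the commit's exact output:

```cpp
#include <cstdio>

// Illustrative logging of the two new hparams (field names hypothetical).
static void print_extra_hparams(float clamp_kqv, float max_alibi_bias) {
    fprintf(stderr, "clamp_kqv      = %.1f\n", clamp_kqv);
    fprintf(stderr, "max_alibi_bias = %.1f\n", max_alibi_bias);
}
```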
ggerganov merged f5f9121d into master 2 years ago
goerch commented on 2023-10-10
Assignees: No one assigned
Labels: high priority, model