llama.cpp
8e52631d - model: Add Mimo v2.5 model support (#22493)

Commit
8 days ago
model: Add Mimo v2.5 model support (#22493) * add mimo-v2.5 support * mimo-v2.5: fix modify_tensors row split * mimi-v2.5: forgot `add_attn_value_scale` plumbing * mimi-v2.5: fix tp dequant to detect tp rows * mimo-v2.5: fix TP iteration to be descending * mimo-v2.5: fix comment * mimo-v2.5: retain fused qkv * mimo-v2.5: missed the attn_value scale during merge * mimo-v2.5: fused QKV needs contiguous for scaling attention value * mimo-v2.5: move `speech_embeddings.` to TextModel filter_tensors * Update src/llama-hparams.h Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com> * Update src/models/mimo2.cpp Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com> * Update src/models/mimo2.cpp Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com> * Update convert_hf_to_gguf.py Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com> * Update convert_hf_to_gguf.py Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com> * Update src/models/mimo2.cpp Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com> * mimo-v2.5: include MTP weights in gguf --------- Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
Author
Parents
Loading