llama.cpp
model : add PLaMo-2 model #14560
Merged

CISC merged 80 commits into ggml-org:master from mitmul:mitmul/add-plamo2
compilade wip: llama : separate recurrent states from the KV cache
271104c6
compilade llama : use std::find for seq_nodes in llama_rs_cache
8db1e4d4
compilade llama : state checkpoints for recurrent models
0028010d
compilade llama : correctly handle more edge cases for the rs cache
0c8b3b20
compilade Merge branch 'master' into compilade/refactor-kv-cache
d66849f6
compilade llama : rename many llama_kv_cache_* functions
a09db95e
compilade Merge branch 'master' into compilade/refactor-kv-cache
c460ff1a
compilade llama : remove useless return value for some llama_cache_* functions
b6fafd17
compilade Merge branch 'master' into compilade/refactor-kv-cache
b7ec12eb
compilade Merge branch 'master' into compilade/refactor-kv-cache
3b57b55c
compilade llama : rethink recurrent state cell counts
7e13f19f
compilade llama : support Jamba
cbc743e6
compilade Merge branch 'master' into compilade/refactor-kv-cache
0fd13e94
compilade llama : fix BERT inference without KV cache
61a88a1d
compilade convert-hf : check for unprocessed Jamba experts
ea2e63e9
compilade convert-hf : support Mini-Jamba conversion
fc59407e
compilade llama : fix Jamba quantization sanity checks
181dadf2
compilade llama : sequence-length-aware batch splitting
3a414b0b
compilade Merge branch 'master' into compilade/refactor-kv-cache
4e4c41e5
compilade llama : use equal-sequence-length sub-batches for recurrent models
3587a949
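Recurrent models such as Mamba update a fixed-size state once per token per sequence, so a sub-batch must advance every sequence it touches by the same number of tokens. A hypothetical pure-Python sketch of such a split (names and data layout are illustrative, not the llama.cpp batch API):

```python
# Illustrative sketch (not the llama.cpp implementation): split a batch of
# (seq_id, token) pairs into sub-batches where every still-active sequence
# contributes the same number of tokens, so recurrent state updates stay uniform.
from collections import defaultdict

def split_equal_length(batch):
    """batch: list of (seq_id, token). Returns a list of sub-batches."""
    per_seq = defaultdict(list)
    for seq_id, tok in batch:
        per_seq[seq_id].append(tok)
    sub_batches = []
    step = 0
    while any(step < len(toks) for toks in per_seq.values()):
        # sequences that still have tokens left at this step
        active = {s: toks for s, toks in per_seq.items() if step < len(toks)}
        # largest count every active sequence can contribute equally
        n = min(len(toks) - step for toks in active.values())
        sub_batches.append(
            [(s, toks[step + i]) for s, toks in sorted(active.items())
             for i in range(n)])
        step += n
    return sub_batches
```

For example, sequences of lengths 3 and 1 split into a first sub-batch advancing both by one token, then a second advancing only the longer one.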
compilade Merge branch 'master' into compilade/refactor-kv-cache
5d3c7b95
compilade llama : fix batch split output count for embeddings
72eea492
compilade llama : minimize swaps when reordering logits
18d1c140
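Reordering logits with a minimal number of moves is an instance of the classic cycle-following permutation technique: each cycle is rotated with one saved element instead of repeated pairwise swaps. A sketch of the general idea (illustrative, not the actual llama.cpp code):

```python
# Illustrative sketch: apply new[i] = old[perm[i]] in place, one write per
# element, by walking each permutation cycle with a single saved value.
def reorder_logits(a, perm):
    seen = [False] * len(a)
    moves = 0
    for start in range(len(a)):
        if seen[start] or perm[start] == start:
            seen[start] = True
            continue
        saved, j = a[start], start
        while True:
            seen[j] = True
            k = perm[j]
            if k == start:          # cycle closed: drop in the saved value
                a[j] = saved
                moves += 1
                break
            a[j] = a[k]             # pull the next element along the cycle
            moves += 1
            j = k
    return moves
```

A cycle of length L costs exactly L writes, so fixed points cost nothing.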
compilade llama : fix edge case finding batch seq_id of split recurrent cell
61200ef2
compilade llama : avoid copies for simple batch splits
eb589d5e
compilade llama : use im2col and mul_mat to perform convolution for Mamba
8fb57ac0
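im2col turns a convolution into a matrix multiplication by materializing each sliding input window as a row, which lets existing fast mul_mat kernels do the work. A minimal single-channel, causal 1-D sketch of the idea (not the ggml kernels themselves):

```python
# Illustrative sketch of im2col + matmul for a causal 1-D convolution.
def im2col_1d(x, k):
    """Pad with k-1 zeros on the left; one row per output position,
    containing that position's k-wide input window."""
    padded = [0.0] * (k - 1) + list(x)
    return [padded[t:t + k] for t in range(len(x))]

def conv1d_via_matmul(x, w):
    """Convolution expressed as (im2col matrix) x (kernel vector)."""
    return [sum(c * wi for c, wi in zip(row, w))
            for row in im2col_1d(x, len(w))]
```

The trade-off is extra memory for the unfolded windows in exchange for reusing a highly optimized matrix-multiply path.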
compilade llama : fix .base() compilation error on Windows
17f6c1ef
compilade llama : allow doing the equivalent of SSM_CONV with SUM_ROWS and MUL
fee3c1d7
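This commit allows expressing the equivalent of ggml's SSM_CONV (the short causal convolution in Mamba's state update) as an elementwise MUL followed by SUM_ROWS. A pure-Python sketch of why the two formulations agree (shapes and names illustrative):

```python
# Illustrative sketch: a depthwise conv over the last d_conv timesteps equals
# an elementwise multiply with the kernel followed by a sum over the time axis.
def ssm_conv_direct(window, kernel):
    """window, kernel: d_conv rows of d_inner channels. Per-channel dot
    product over time, as a fused conv would compute it."""
    d_conv, d_inner = len(window), len(window[0])
    return [sum(window[t][c] * kernel[t][c] for t in range(d_conv))
            for c in range(d_inner)]

def ssm_conv_mul_sumrows(window, kernel):
    # MUL: elementwise product, same shape as the window
    prod = [[w * k for w, k in zip(wt, kt)]
            for wt, kt in zip(window, kernel)]
    # SUM_ROWS: collapse the time dimension
    return [sum(prod[t][c] for t in range(len(prod)))
            for c in range(len(prod[0]))]
```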
compilade Merge branch 'master' into compilade/refactor-kv-cache
6840ac0b
compilade llama : rename llama_cache to llama_past
372482df
compilade examples : replace llama_kv_cache_seq_* with llama_past_seq_*
43d8d4bf
compilade Merge branch 'master' into compilade/refactor-kv-cache
ff794f55
compilade mamba : fix non-contiguous usage of ggml_silu
33425a7e
compilade Merge branch 'master' into compilade/refactor-kv-cache
10c3c419
compilade Merge branch 'master' into compilade/refactor-kv-cache
9b38f8bf
compilade Merge branch 'master' into compilade/refactor-kv-cache
bc320ef6
compilade llama : session saving and reloading for hybrid models
fcb889cf
compilade Merge branch 'master' into compilade/refactor-kv-cache
a03e32a3
compilade convert_hf : fix Jamba conversion
9d3f44da
compilade llama : fix mixed signedness comparison
5f62db79
compilade llama : use unused n_embd_k_gqa in k_shift
375de5b1
compilade llama : begin renaming llama_past back to llama_kv_cache
4bb4b22a
compilade Merge branch 'master' into compilade/refactor-kv-cache
63ac36b2
compilade Merge branch 'master' into compilade/refactor-kv-cache
124c222f
compilade llama : remove implicit recurrent state rollbacks
8006f3b3
compilade Merge branch 'master' into compilade/refactor-kv-cache
691698e1
compilade llama : partially apply clang-format style
e3fe6120
compilade Merge branch 'master' into compilade/refactor-kv-cache
2bcaf64e
compilade convert : fix jamba conv1d shape squeezing
908e6559
compilade Merge branch 'master' into compilade/refactor-kv-cache
4682e21c
compilade graph : add back hybrid memory graph input
20f8e43e
compilade model : add Jamba to Mamba-specific hparams printing
07c252f0
github-actions added the examples and python labels
mitmul changed the title from Mitmul/add plamo2 to Add PLaMo-2 model 160 days ago
mitmul marked this pull request as ready for review 160 days ago
compilade Merge branch 'master' into compilade/refactor-kv-cache
f7163582
mitmul Add PLaMo-2 model using hybrid memory module
f6567128
mitmul Fix z shape
4728e42a
mitmul force-pushed to 4728e42a 159 days ago
mitmul Add cmath to include from llama-vocab.h
6acaf3c5
mitmul Explicitly dequantize normalization weights before RoPE apply
7e4c5ecc
mitmul Revert unnecessary cast because the problem can be solved by excludin…
149b98c8
mitmul Use ATTN_K/Q_NORM for k,q weights to prevent quantization
77865202
mitmul changed the title from Add PLaMo-2 model to model : add PLaMo-2 model 159 days ago
compilade commented on 2025-07-07
mitmul Remove SSM_BCDT that is not used from anywhere
0424a76e
mitmul Do not duplicate embedding weights for output.weight
ea95a1da
ggerganov commented on 2025-07-09
mitmul Fix tokenizer encoding problem for multibyte strings
2d76b21e
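The commit message does not show the exact bug, but as a general illustration (hypothetical, not the PLaMo-2 tokenizer code): byte-level tokenizers must work on the UTF-8 byte sequence and only decode once a full character's bytes are assembled; decoding a partial multibyte sequence corrupts the string.

```python
# Hypothetical illustration of UTF-8-safe byte-level token handling.
def to_byte_tokens(text):
    """Encode to UTF-8 and treat each byte as one token."""
    return list(text.encode("utf-8"))

def from_byte_tokens(tokens):
    """Decode only the fully assembled byte sequence; a truncated
    multibyte sequence would raise UnicodeDecodeError instead."""
    return bytes(tokens).decode("utf-8")
```

Each hiragana character, for instance, occupies three bytes, so splitting the token stream mid-character leaves an undecodable fragment.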
mitmul Merge remote-tracking branch 'upstream/master' into mitmul/add-plamo2
fccec6db
mitmul force-pushed to fccec6db 157 days ago
mitmul Merge branch 'master' into mitmul/add-plamo2
5231e4f7
CISC commented on 2025-07-11
mitmul Apply suggestion from @CISC
521c1e0f
mitmul Update src/llama-model.cpp
df95fced
mitmul Use LLM_FFN_SWIGLU instead of splitting ffn_gate and ffn_up
498b8b37
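LLM_FFN_SWIGLU expresses the SwiGLU feed-forward, down(silu(x·W_gate) ⊙ (x·W_up)), as one fused path instead of separate ffn_gate and ffn_up steps. A minimal numeric sketch of the computation (toy shapes, illustrative only):

```python
# Illustrative sketch of the SwiGLU feed-forward computation.
import math

def silu(v):
    return v / (1.0 + math.exp(-v))

def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def swiglu_ffn(x, w_gate, w_up, w_down):
    gate = [silu(g) for g in matvec(w_gate, x)]   # gated branch
    up = matvec(w_up, x)                          # linear branch
    return matvec(w_down, [g * u for g, u in zip(gate, up)])
```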
mitmul Remove unnecessary part for Grouped Query Attention
6afd3be0
mitmul Fix how to load special token id to gguf
34360ebe
mitmul Remove unused tensor mapping
71abd3ad
CISC commented on 2025-07-12
mitmul Update src/llama-model.cpp
fb2ae69a
mitmul Remove llama_vocab_plamo2 class and replace it with llm_tokenizer_pla…
eea696e4
mitmul force-pushed to eea696e4 153 days ago
ggerganov approved these changes on 2025-07-14
ggerganov requested a review from CISC 153 days ago
mitmul Update src/llama-vocab.cpp
841ffc85
CISC requested changes on 2025-07-14
CISC commented on 2025-07-14
mitmul Update convert_hf_to_gguf.py
35d81889
mitmul Update src/llama-model.cpp
d134e7f6
mitmul Update src/llama-model.cpp
921e864d
mitmul Merge remote-tracking branch 'upstream/master' into mitmul/add-plamo2
f87ac1c9
CISC requested changes on 2025-07-15
mitmul Update convert_hf_to_gguf.py
7b0b2ead
mitmul Update convert_hf_to_gguf.py
b42f95d6
CISC approved these changes on 2025-07-15
mitmul Fix plamo2 tokenizer session to prevent multiple calls of build()
6921534f
CISC merged 68e37a61 into master 152 days ago
mitmul deleted the mitmul/add-plamo2 branch 152 days ago
