llama.cpp
llama : initial Mamba-2 support
#9126
Merged
compilade merged 44 commits into master from compilade/mamba2
compilade
marked this pull request as draft
1 year ago
github-actions
added
python
github-actions
added
ggml
llama : initial Mamba-2 support
1f0fea70
ggml : SIMD ggml_ssm_scan for Mamba-2
dceff23f
llama : support running Mamba-Codestral-7B-v0.1
2bfe9de6
llama : fix Mamba-2 conv state saving
aff96920
compilade
force pushed
from
e9b0d198
to
aff96920
1 year ago
compilade
changed the base branch from
compilade/batch-splits
to
master
1 year ago
compilade
marked this pull request as ready for review
1 year ago
compilade
added
Review Complexity : Medium
llama : remove unused variable
e04910dc
llama : add missing break
fa358e70
convert_hf : prefer SentencePiece tokenizer for Mamba-2 when present
38913dc8
Vaibhavs10
commented on 2024-08-23
Merge branch 'master' into compilade/mamba2
0e601caf
llama : avoid redundant state copy for Mamba 1 and 2
273e7a49
Merge branch 'master' into compilade/mamba2
7d6cb368
github-actions
added
testing
metal : attempt to adapt SSM_SCAN for Mamba-2
2c77d799
metal : fix SSM_SCAN pipeline scope
87b97d08
ggerganov
commented on 2024-10-02
metal : use log and exp instead of log1pf and expf in SSM_SCAN
03d0e6ea
metal : remove unused arguments for SSM_SCAN
7a351abc
metal : add back n_seqs to SSM_SCAN args
8b15bc6f
metal : fix SSM_SCAN state head offset
5b8ec2b9
metal : fix wrong number of tokens per sequence in SSM_SCAN
62b09b34
Merge branch 'master' into compilade/mamba2
038d9583
ggml : remove unused fast broadcast path in GGML_MUL
805512a7
Merge branch 'master' into compilade/mamba2
7d16e1bc
ggml : avoid multiply by D in GGML_OP_SSM_SCAN
3bc7103d
Merge branch 'master' into compilade/mamba2
8d8f0657
convert : fix flake8 lint
b4e9c599
Merge branch 'master' into compilade/mamba2
1ee6c482
Merge branch 'master' into compilade/mamba2
c9ecf620
github-actions
added
Apple Metal
Merge branch 'master' into compilade/mamba2
35d06fac
metal : fix confusion between ; and ,
cf4f0a41
metal : add missing args for nb references in ssm_scan_f32_group
6def5cd7
metal : single-user mamba2 inference works
791998b4
kv-cache : remove const_cast when setting inputs for s_copy
94c3d530
Merge branch 'master' into compilade/mamba2
929fe85d
convert : avoid AutoConfig for Mamba and Mamba2 hparams
d55b0d06
kv-cache : allow context shift for recurrent models
e94f3932
Merge branch 'master' into compilade/mamba2
9864bfcd
graph : fix recurrent state copies when avoiding copies
2fa5f2ce
ggml : fix mamba2 ssm scan when compiled with SVE
757aa623
ggml-cpu : reorder SVE FMA for consistency with other SIMD arches
0b6f6bec
Merge branch 'master' into compilade/mamba2
a42f2394
cuda : implement ssm scan for Mamba2
f8c7caee
github-actions
added
Nvidia GPU
Merge branch 'master' into compilade/mamba2
830e5542
Merge branch 'master' into compilade/mamba2
afdb6692
mamba : fix mismatched new and delete size for llm_build_mamba
dc1d109d
ggerganov
approved these changes on 2025-07-01
Merge branch 'master' into compilade/mamba2
73de1fd1
cuda : graceful fallback for Mamba-1 models with weird embd size
71bef665
compilade
merged
5d46babd
into master
80 days ago
Reviewers
ggerganov
Vaibhavs10
Assignees
No one assigned
Labels
testing
Nvidia GPU
python
Review Complexity : Medium
ggml
Apple Metal
Milestone
No milestone