llama.cpp
llama : initial Mamba-2 support
#9126
Merged

Commits
  • llama : initial Mamba-2 support
    compilade committed 1 year ago
  • ggml : SIMD ggml_ssm_scan for Mamba-2
    compilade committed 1 year ago
  • llama : support running Mamba-Codestral-7B-v0.1
    compilade committed 1 year ago
  • llama : fix Mamba-2 conv state saving
    compilade committed 1 year ago
  • llama : remove unused variable
    compilade committed 1 year ago
  • llama : add missing break
    compilade committed 1 year ago
  • convert_hf : prefer SentencePiece tokenizer for Mamba-2 when present
    compilade committed 1 year ago
  • Merge branch 'master' into compilade/mamba2
    compilade committed 1 year ago
  • llama : avoid redundant state copy for Mamba 1 and 2
    compilade committed 1 year ago
  • Merge branch 'master' into compilade/mamba2
    compilade committed 1 year ago
  • metal : attempt to adapt SSM_SCAN for Mamba-2
    compilade committed 1 year ago
  • metal : fix SSM_SCAN pipeline scope
    compilade committed 1 year ago
  • metal : use log and exp instead of log1pf and expf in SSM_SCAN
    compilade committed 1 year ago
  • metal : remove unused arguments for SSM_SCAN
    compilade committed 1 year ago
  • metal : add back n_seqs to SSM_SCAN args
    compilade committed 1 year ago
  • metal : fix SSM_SCAN state head offset
    compilade committed 1 year ago
  • metal : fix wrong number of tokens per sequence in SSM_SCAN
    compilade committed 1 year ago
  • Merge branch 'master' into compilade/mamba2
    compilade committed 1 year ago
  • ggml : remove unused fast broadcast path in GGML_MUL
    compilade committed 1 year ago
  • Merge branch 'master' into compilade/mamba2
    compilade committed 1 year ago
  • ggml : avoid multiply by D in GGML_OP_SSM_SCAN
    compilade committed 1 year ago
  • Merge branch 'master' into compilade/mamba2
    compilade committed 1 year ago
  • convert : fix flake8 lint
    compilade committed 1 year ago
  • Merge branch 'master' into compilade/mamba2
    compilade committed 349 days ago
  • Merge branch 'master' into compilade/mamba2
    compilade committed 255 days ago
  • Merge branch 'master' into compilade/mamba2
    compilade committed 191 days ago
  • metal : fix confusion between ; and ,
    compilade committed 191 days ago
  • metal : add missing args for nb references in ssm_scan_f32_group
    compilade committed 191 days ago
  • metal : single-user mamba2 inference works
    compilade committed 191 days ago
  • kv-cache : remove const_cast when setting inputs for s_copy
    compilade committed 191 days ago
  • Merge branch 'master' into compilade/mamba2
    compilade committed 191 days ago
  • convert : avoid AutoConfig for Mamba and Mamba2 hparams
    compilade committed 190 days ago
  • kv-cache : allow context shift for recurrent models
    compilade committed 190 days ago
  • Merge branch 'master' into compilade/mamba2
    compilade committed 151 days ago
  • graph : fix recurrent state copies when avoiding copies
    compilade committed 151 days ago
  • ggml : fix mamba2 ssm scan when compiled with SVE
    compilade committed 151 days ago
  • ggml-cpu : reorder SVE FMA for consistency with other SIMD arches
    compilade committed 151 days ago
  • Merge branch 'master' into compilade/mamba2
    compilade committed 143 days ago
  • cuda : implement ssm scan for Mamba2
    compilade committed 143 days ago
  • Merge branch 'master' into compilade/mamba2
    compilade committed 143 days ago
  • Merge branch 'master' into compilade/mamba2
    compilade committed 139 days ago
  • mamba : fix mismatched new and delete size for llm_build_mamba
    compilade committed 135 days ago
  • Merge branch 'master' into compilade/mamba2
    compilade committed 130 days ago
  • cuda : graceful fallback for Mamba-1 models with weird embd size
    compilade committed 130 days ago