llama.cpp
fbca2f27 - Add support for ArcticForCausalLM (#7020)

Commit

2 years ago

Add support for ArcticForCausalLM (#7020)

* common : increase max number of experts to 128
* common : add tensor LLM_TENSOR_FFN_NORM_EXPS for normalization before MoE that runs in parallel to attention + ffn
* gguf-py : add architecture-specific block mappings that override selected general block mappings
* convert-hf : add model conversion support for ArcticForCausalLM
* convert-hf : use added_tokens_decoder from tokenizer_config.json to redefine tokens from SentencePiece model (only for ArcticForCausalLM)
* llama : add inference support for LLM_ARCH_ARCTIC

---------

Co-authored-by: Stanisław Szymczyk <sszymczy@gmail.com>
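The gguf-py item describes architecture-specific block mappings that override selected general block mappings. The sketch below illustrates that idea only and is not the gguf-py code: the table names, the helper functions, and the tensor name strings are all hypothetical placeholders chosen for illustration. It assumes a general table shared by all architectures, plus a per-architecture table whose entries replace the general entries with the same key.

```python
from typing import Dict, Tuple

# Hypothetical general table: GGUF-side tensor key -> candidate source name templates.
GENERAL_BLOCK_MAPPINGS: Dict[str, Tuple[str, ...]] = {
    "attn_norm": ("model.layers.{bid}.input_layernorm",),
    "ffn_norm": ("model.layers.{bid}.post_attention_layernorm",),
}

# Hypothetical per-architecture overrides. Source names are illustrative assumptions.
ARCH_BLOCK_MAPPINGS: Dict[str, Dict[str, Tuple[str, ...]]] = {
    "arctic": {
        # Replaces the general "ffn_norm" entry for this architecture only.
        "ffn_norm": ("model.layers.{bid}.residual_layernorm",),
        # Extra norm applied before the MoE branch.
        "ffn_norm_exps": ("model.layers.{bid}.post_attention_layernorm",),
    },
}


def block_mappings_for(arch: str) -> Dict[str, Tuple[str, ...]]:
    """Return the general mappings with any architecture-specific entries applied on top."""
    merged = dict(GENERAL_BLOCK_MAPPINGS)
    merged.update(ARCH_BLOCK_MAPPINGS.get(arch, {}))
    return merged


def build_name_lookup(arch: str, n_blocks: int) -> Dict[str, str]:
    """Expand the templates per block and map each source tensor name to a target name."""
    lookup: Dict[str, str] = {}
    for tensor, templates in block_mappings_for(arch).items():
        for bid in range(n_blocks):
            for template in templates:
                lookup[template.format(bid=bid)] = f"blk.{bid}.{tensor}"
    return lookup
```

Under these assumptions, the same source name resolves to a different target depending on the architecture, which is why an override is needed rather than simply appending another candidate name to the general table.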

References

#7020 - Added support for the ArcticForCausalLM.

Author

fairydreaming

Parents

0df0aa8e
