llama.cpp
1466621e - llama : Support llama 4 text-only (#12791)

Commit
112 days ago
llama : Support llama 4 text-only (#12791)
  • llama4 conversion
  • initial support, no chat template
  • clean up a bit
  • fix tokenizer conversion
  • correct hparams
  • try this
  • fix shexp
  • ffn_inp_normed
  • chat template
  • clean up model conversion
  • add_bos
  • add scale_before_ffn
  • fix order
  • weight_before_ffn
  • llm_graph_input_attn_temp
  • add chunk attn mask
  • build_inp_attn_scale()
  • add comment about ggml_repeat
  • clarify comments
  • fix build
Files
  • convert_hf_to_gguf.py
  • convert_hf_to_gguf_update.py
  • gguf-py/gguf/constants.py
  • gguf-py/gguf/gguf_writer.py
  • include/llama.h
  • models/ggml-vocab-llama4.gguf.inp
  • models/ggml-vocab-llama4.gguf.out
  • src/llama-arch.cpp
  • src/llama-arch.h
  • src/llama-chat.cpp
  • src/llama-chat.h
  • src/llama-graph.cpp
  • src/llama-graph.h
  • src/llama-hparams.h
  • src/llama-model.cpp
  • src/llama-model.h
  • src/llama-vocab.cpp
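The "add chunk attn mask" change touches the graph code (src/llama-graph.cpp): Llama 4's local attention layers restrict each token to keys in the same fixed-size chunk, on top of the usual causal constraint. A minimal sketch of that mask predicate, assuming a chunk size of 8192 as in Llama 4 (the helper name is illustrative):

```python
def chunked_causal_mask(n_tokens: int, chunk: int = 8192) -> list[list[bool]]:
    """mask[i][j] is True iff query position i may attend to key j:
    j must not be in the future (causal) AND i and j must fall in
    the same chunk of `chunk` positions."""
    return [[(j <= i) and (i // chunk == j // chunk)
             for j in range(n_tokens)]
            for i in range(n_tokens)]
```

With a tiny chunk size the effect is easy to see: a token just past a chunk boundary cannot attend to the token immediately before it, only to positions inside its own chunk.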