llama.cpp
model : add grok-2 support
#15539
Merged

Commits
  • add grok-2 support
    CISC committed 305 days ago
  • type fix
    CISC committed 305 days ago
  • type fix
    CISC committed 305 days ago
  • type fix
    CISC committed 305 days ago
  • "fix" vocab for invalid sequences
    CISC committed 305 days ago
  • fix expert tensor mapping and spaces in vocab
    CISC committed 305 days ago
  • add chat template
    CISC committed 305 days ago
  • fix norm tensor mapping
    CISC committed 304 days ago
  • rename layer_out_norm to ffn_post_norm
    CISC committed 304 days ago
  • ensure ffn_post_norm is mapped
    CISC committed 304 days ago
  • fix experts merging
    CISC committed 304 days ago
  • remove erroneous FFN_GATE entry
    CISC committed 304 days ago
  • concatenate split tensors and add more metadata
    CISC committed 303 days ago
  • process all expert layers and try cat instead of hstack
    CISC committed 302 days ago
  • add support for community BPE vocab
    CISC committed 302 days ago
  • fix expert feed forward length and ffn_down concat
    CISC committed 302 days ago
  • commit this too
    CISC committed 302 days ago
  • add ffn_up/gate/down, unsure if sequence is right
    CISC committed 301 days ago
  • add ffn_gate/down/up to tensor names
    CISC committed 301 days ago
  • correct residual moe (still not working)
    CISC committed 299 days ago
  • mess--
    CISC committed 299 days ago
  • fix embedding scale being applied twice
    CISC committed 297 days ago
  • add built in chat template
    CISC committed 297 days ago
  • change beta fast for grok if default value
    CISC committed 295 days ago
  • remove spm vocab in favor of community bpe vocab
    CISC committed 295 days ago
  • change attention temp length metadata type to integer
    CISC committed 295 days ago
  • update attention temp length metadata
    CISC committed 295 days ago
  • remove comment
    CISC committed 295 days ago
  • Merge branch 'master' into cisc/grok-2
    CISC committed 295 days ago
  • replace M_SQRT2 with std::sqrt(2)
    CISC committed 295 days ago
  • Merge branch 'master' into cisc/grok-2
    CISC committed 290 days ago
  • add yarn metadata, move defaults to hparams
    CISC committed 285 days ago