llama.cpp
Add support for SmallThinker model series
#14898
Merged

Commits
  • support smallthinker
    wdl339 committed 276 days ago
  • support 20b softmax, 4b no sliding window
    wdl339 committed 260 days ago
  • Merge branch 'master' into smallthinker
    wdl339 committed 260 days ago
  • new build_moe_ffn_from_probs, and can run 4b
    wdl339 committed 260 days ago
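A `build_moe_ffn_from_probs` helper suggests the MoE FFN path is built from router probabilities that were already computed (e.g. by an earlier softmax), rather than from raw logits. As a rough illustration of that routing step only, here is a minimal, self-contained sketch of top-k expert selection with weight renormalization; the function name and shape are hypothetical and the real helper constructs a ggml compute graph instead:

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <utility>
#include <vector>

// Illustrative only: pick the top-k experts from precomputed router
// probabilities and return (expert index, renormalized weight) pairs.
std::vector<std::pair<int, float>> route_top_k(const std::vector<float> & probs, int k) {
    std::vector<int> idx(probs.size());
    std::iota(idx.begin(), idx.end(), 0);
    // Partially sort indices so the k highest-probability experts come first.
    std::partial_sort(idx.begin(), idx.begin() + k, idx.end(),
        [&](int a, int b) { return probs[a] > probs[b]; });
    float sum = 0.0f;
    for (int i = 0; i < k; ++i) {
        sum += probs[idx[i]];
    }
    // Renormalize so the selected experts' weights sum to 1.
    std::vector<std::pair<int, float>> out;
    for (int i = 0; i < k; ++i) {
        out.push_back({idx[i], probs[idx[i]] / sum});
    }
    return out;
}
```

In the actual graph code the selection and mixing are expressed as tensor ops over a batch of tokens; this scalar version only shows the arithmetic.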
  • fix 4b rope bug
    wdl339 committed 259 days ago
  • Merge branch 'master' into smallthinker
    wdl339 committed 257 days ago
  • fix python type check
    wdl339 committed 257 days ago
  • remove is_moe check
    wdl339 committed 257 days ago
  • remove set_dense_start_swa_pattern function and modify set_swa_pattern function
    wdl339 committed 257 days ago
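Folding `set_dense_start_swa_pattern` into a single `set_swa_pattern` implies one function now decides which layers use sliding-window attention and which stay dense (full attention). A minimal sketch of one way such a pattern could work, with a `dense_first` flag standing in for the removed variant; the signature and semantics here are assumptions, not the actual llama.cpp implementation:

```cpp
#include <cassert>
#include <vector>

// Illustrative sketch: with pattern n, one layer in every group of n stays
// dense (full attention) and the rest use sliding-window attention (SWA).
// dense_first flips whether the dense layer is the first or the last of each
// group, which is one way a single function could cover both layouts.
std::vector<bool> make_swa_layers(int n_layer, int pattern, bool dense_first = false) {
    std::vector<bool> swa(n_layer, false);
    if (pattern <= 1) {
        return swa; // pattern <= 1: every layer is dense, no sliding window
    }
    for (int il = 0; il < n_layer; ++il) {
        if (dense_first) {
            swa[il] = (il % pattern) != 0;           // dense layer leads each group
        } else {
            swa[il] = (il % pattern) < pattern - 1;  // dense layer ends each group
        }
    }
    return swa;
}
```

The "4b no sliding window" commit above corresponds to the degenerate case where no layer is marked SWA at all.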
  • trim trailing whitespace
    wdl339 committed 257 days ago
  • remove get_vocab_base of SmallThinkerModel in convert_hf_to_gguf.py
    wdl339 committed 256 days ago
  • better whitespace
    wdl339 committed 256 days ago
  • use GGML_ASSERT for expert count validation
    wdl339 committed 256 days ago
  • Improve null pointer check for probs
    wdl339 committed 256 days ago
  • use template parameter for SWA attention logic
    wdl339 committed 256 days ago
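Using a template parameter for the SWA attention logic moves the sliding-window-vs-full-attention decision to compile time, so the per-position code carries no runtime branch on the attention type. A self-contained sketch of the idea, assuming a visibility predicate over query/key positions (the function and its masking rule are illustrative, not the PR's actual code):

```cpp
#include <cassert>

// Illustrative sketch: resolve the SWA branch at compile time via a template
// parameter. Returns whether key position kv is visible from query position q
// under causal masking, and additionally under a sliding window of size n_swa
// when swa is true.
template <bool swa>
bool is_visible(int q, int kv, int n_swa) {
    if (kv > q) {
        return false;              // causal mask: no attending to the future
    }
    if constexpr (swa) {
        return q - kv < n_swa;     // sliding window: only the last n_swa positions
    }
    return true;                   // full attention: all past positions visible
}
```

Instantiating `is_visible<true>` and `is_visible<false>` yields two specialized functions; `if constexpr` discards the dead branch in each, which is the usual motivation for this pattern in hot loops.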
  • better whitespace
    wdl339 committed 256 days ago
  • move the creation of inp_out_ids before the layer loop
    wdl339 committed 256 days ago
  • remove redundant check for probs
    wdl339 committed 256 days ago