llama.cpp
Add support for SmallThinker model series
#14898
Merged

Commits
  • support smallthinker
    wdl339 committed 276 days ago
  • support 20b softmax, 4b no sliding window
    wdl339 committed 260 days ago
  • Merge branch 'master' into smallthinker
    wdl339 committed 260 days ago
  • new build_moe_ffn_from_probs, and can run 4b
    wdl339 committed 260 days ago
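A `build_moe_ffn_from_probs` helper suggests the MoE FFN path is built from router probabilities that were already computed (e.g. by an earlier softmax), rather than from raw logits. As a rough illustration of that routing step only, here is a minimal, self-contained sketch of top-k expert selection with weight renormalization; the function name and shape are hypothetical and the real helper constructs a ggml compute graph instead:

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <utility>
#include <vector>

// Illustrative only: pick the top-k experts from precomputed router
// probabilities and return (expert index, renormalized weight) pairs.
std::vector<std::pair<int, float>> route_top_k(const std::vector<float> & probs, int k) {
    std::vector<int> idx(probs.size());
    std::iota(idx.begin(), idx.end(), 0);
    // Partially sort indices so the k highest-probability experts come first.
    std::partial_sort(idx.begin(), idx.begin() + k, idx.end(),
        [&](int a, int b) { return probs[a] > probs[b]; });
    float sum = 0.0f;
    for (int i = 0; i < k; ++i) {
        sum += probs[idx[i]];
    }
    // Renormalize so the selected experts' weights sum to 1.
    std::vector<std::pair<int, float>> out;
    for (int i = 0; i < k; ++i) {
        out.push_back({idx[i], probs[idx[i]] / sum});
    }
    return out;
}
```

In the actual graph code the selection and mixing are expressed as tensor ops over a batch of tokens; this scalar version only shows the arithmetic.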
  • fix 4b rope bug
    wdl339 committed 259 days ago
  • Merge branch 'master' into smallthinker
    wdl339 committed 257 days ago
  • fix python type check
    wdl339 committed 257 days ago
  • remove is_moe check
    wdl339 committed 257 days ago
  • remove set_dense_start_swa_pattern function and modify set_swa_pattern function
    wdl339 committed 257 days ago
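Folding `set_dense_start_swa_pattern` into a single `set_swa_pattern` implies one function now decides which layers use sliding-window attention and which stay dense (full attention). A minimal sketch of one way such a pattern could work, with a `dense_first` flag standing in for the removed variant; the signature and semantics here are assumptions, not the actual llama.cpp implementation:

```cpp
#include <cassert>
#include <vector>

// Illustrative sketch: with pattern n, one layer in every group of n stays
// dense (full attention) and the rest use sliding-window attention (SWA).
// dense_first flips whether the dense layer is the first or the last of each
// group, which is one way a single function could cover both layouts.
std::vector<bool> make_swa_layers(int n_layer, int pattern, bool dense_first = false) {
    std::vector<bool> swa(n_layer, false);
    if (pattern <= 1) {
        return swa; // pattern <= 1: every layer is dense, no sliding window
    }
    for (int il = 0; il < n_layer; ++il) {
        if (dense_first) {
            swa[il] = (il % pattern) != 0;           // dense layer leads each group
        } else {
            swa[il] = (il % pattern) < pattern - 1;  // dense layer ends each group
        }
    }
    return swa;
}
```

The "4b no sliding window" commit above corresponds to the degenerate case where no layer is marked SWA at all.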
  • trim trailing whitespace
    wdl339 committed 257 days ago
  • remove get_vocab_base of SmallThinkerModel in convert_hf_to_gguf.py
    wdl339 committed 256 days ago
  • better whitespace
    wdl339 committed 256 days ago
  • use GGML_ASSERT for expert count validation
    wdl339 committed 256 days ago
  • Improve null pointer check for probs
    wdl339 committed 256 days ago
  • use template parameter for SWA attention logic
    wdl339 committed 256 days ago
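Using a template parameter for the SWA attention logic moves the sliding-window-vs-full-attention decision to compile time, so the per-position code carries no runtime branch on the attention type. A self-contained sketch of the idea, assuming a visibility predicate over query/key positions (the function and its masking rule are illustrative, not the PR's actual code):

```cpp
#include <cassert>

// Illustrative sketch: resolve the SWA branch at compile time via a template
// parameter. Returns whether key position kv is visible from query position q
// under causal masking, and additionally under a sliding window of size n_swa
// when swa is true.
template <bool swa>
bool is_visible(int q, int kv, int n_swa) {
    if (kv > q) {
        return false;              // causal mask: no attending to the future
    }
    if constexpr (swa) {
        return q - kv < n_swa;     // sliding window: only the last n_swa positions
    }
    return true;                   // full attention: all past positions visible
}
```

Instantiating `is_visible<true>` and `is_visible<false>` yields two specialized functions; `if constexpr` discards the dead branch in each, which is the usual motivation for this pattern in hot loops.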
  • better whitespace
    wdl339 committed 256 days ago
  • move the creation of inp_out_ids before the layer loop
    wdl339 committed 256 days ago
  • remove redundant check for probs
    wdl339 committed 256 days ago