feat: support StarCoder model architectures #3187
add placeholder of starcoder in gguf / llama.cpp
0c5d4d87
support convert starcoder weights to gguf
eb7f0eba
convert MQA to MHA
76d32cca
fix ffn_down name
7e0a843b
add LLM_ARCH_STARCODER to llama.cpp
7298c37e
set head_count_kv = 1
166a259f
load starcoder weight
57f064d7
add max_position_embeddings
a17ef397
set n_positions to max_position_embeddings
26836119
properly load all starcoder params
77c7ec17
fix head count kv
0be15e16
fix comments
dac31da4
fix vram calculation for starcoder
4420cff6
store mqa directly
ab13d071
add input embeddings handling
8bc76a22
add TBD
101c5787
working in cpu, metal buggy
a1cf66ea
cleanup useless code
6c353dc7
metal : fix out-of-bounds access in soft_max kernels
f82328ab
llama : make starcoder graph build more consistent with others
92a4f868
Merge pull request #2 from ggerganov/support-starcoder-fix
caa72209
refactor: cleanup comments a bit
57eaa39c
add other starcoder models: 3B, 7B, 15B
5ca037b9
wsxiaoys marked this pull request as ready for review 1 year ago
support-mqa-directly
08f35c46
Merge pull request #3 from TabbyML/support-starcoder-mqa
e1fa9dd2
fix: remove max_position_embeddings, use n_train_ctx
f989ba15
Update llama.cpp
bb9931cf
ggerganov approved these changes on 2023-09-15
Update llama.cpp
eafcc34f
Apply suggestions from code review
e30ad714
fix: switch to space from tab
72a72854
ggerganov merged 4fe09dfe into master 1 year ago
wsxiaoys deleted the support-starcoder branch 1 year ago