llama: implement YaRN RoPE scaling #2268
cebtenzzre
changed the title from "llama: implement NTK-By-Parts (NTKv2)" to "llama: implement NTK-By-Parts (NTKv2) RoPE scaling" 2 years ago
cebtenzzre
marked this pull request as ready for review 2 years ago
llama: implement NTK-By-Parts (NTKv2) RoPE scaling
8dec38c3
CUDA implementation
6aeb46b3
Metal implementation
9348aa4d
implement new YaRN algorithm
a30ae209
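For context, YaRN ("NTK-by-parts" plus attention-magnitude scaling) blends the position-interpolated angle with the original, unscaled angle per dimension. A minimal C sketch of that correction, following the YaRN paper's formulation; the function and parameter names here are illustrative, not necessarily the PR's exact symbols:

```c
#include <math.h>

#define MIN(a, b) ((a) < (b) ? (a) : (b))
#define MAX(a, b) ((a) > (b) ? (a) : (b))

// 1 below the low correction dim (keep the original, extrapolated angle),
// 0 above the high one (use the position-interpolated angle), with a linear
// blend in between. i0 is the even dimension index of the rotated pair.
static float rope_yarn_ramp(float low, float high, int i0) {
    const float y = (i0 / 2 - low) / MAX(0.001f, high - low);
    return 1.0f - MIN(1.0f, MAX(0.0f, y));
}

// cos/sin of the corrected angle for one pair of dimensions.
//   theta_extrap: unscaled RoPE angle, pos * base^(-i0/n_dims)
//   freq_scale:   1/s for a context-extension factor s
//   ext_factor:   how much extrapolation to mix back in (0 disables YaRN)
//   mscale:       attention-magnitude factor ("attn_factor")
static void rope_yarn(float theta_extrap, float freq_scale, const float corr_dims[2],
                      int i0, float ext_factor, float mscale,
                      float * cos_theta, float * sin_theta) {
    float theta = freq_scale * theta_extrap; // plain linear interpolation
    if (ext_factor != 0.0f) {
        const float ramp_mix = rope_yarn_ramp(corr_dims[0], corr_dims[1], i0) * ext_factor;
        theta = theta * (1.0f - ramp_mix) + theta_extrap * ramp_mix;
        mscale *= 1.0f + 0.1f * logf(1.0f / freq_scale); // paper's 0.1*ln(s) + 1
    }
    *cos_theta = cosf(theta) * mscale;
    *sin_theta = sinf(theta) * mscale;
}
```

Here corr_dims marks where the per-dimension rotation count crosses the beta_fast/beta_slow thresholds; the paper derives it as corr_dim(n_rot) = n_dims * ln(orig_ctx / (2*pi*n_rot)) / (2 * ln(base)).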
cebtenzzre
changed the title from "llama: implement NTK-By-Parts (NTKv2) RoPE scaling" to "llama: implement YaRN RoPE scaling" 2 years ago
cebtenzzre
marked this pull request as draft 2 years ago
Merge branch 'master' of https://github.com/ggerganov/llama.cpp into …
b5ced4fb
ggml : increase GGML_MAX_OP_PARAMS
826269ad
YaRN : avoid NaN if unused betas are zero
cf731d56
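(When the betas are unused, both correction dims can coincide, making high - low zero and the ramp's division 0/0, hence NaN; the MAX(0.001f, high - low) clamp in the sketch above reflects that guard.)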
YaRN : fix missing parameter in CUDA impl
dcb058ce
convert : reduce unnecessary variables in Params
281b26e6
Merge branch 'master' of https://github.com/ggerganov/llama.cpp into …
a06c7292
llama : simplify use of context params
dc26a0dd
llama : store YaRN parameters in GGUF
904d4edf
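The metadata lands under per-architecture keys. A sketch of the key names as I read them from the eventual gguf constants, so treat the exact spellings as an assumption rather than a quote from this PR (%s is the architecture prefix):

```c
// GGUF keys for RoPE scaling metadata (spellings assumed, not PR-verbatim):
static const char * LLM_KV_ROPE_SCALING_TYPE         = "%s.rope.scaling.type";
static const char * LLM_KV_ROPE_SCALING_FACTOR       = "%s.rope.scaling.factor";
static const char * LLM_KV_ROPE_SCALING_ORIG_CTX_LEN = "%s.rope.scaling.original_context_length";
static const char * LLM_KV_ROPE_SCALING_FINETUNED    = "%s.rope.scaling.finetuned";
```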
fix convert scripts
56abb9a4
llama : fix C compatibility
43eaf06a
don't hardcode max_pos_emb
fe788c45
cebtenzzre
marked this pull request as ready for review 2 years ago
address review comments
e0b120c3
restore backwards compatibility with *.rope.scale_linear
19bb74e7
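A sketch of the fallback, assuming a hypothetical gguf_kv_f32 lookup helper; the point is only that the legacy key is consulted when the new one is absent:

```c
#include <stdbool.h>

struct gguf_context;
// hypothetical helper: returns true and fills *out if the key is present
bool gguf_kv_f32(const struct gguf_context * meta, const char * key, float * out);

static float load_rope_freq_scale(const struct gguf_context * meta) {
    float factor = 0.0f;
    if (!gguf_kv_f32(meta, "%s.rope.scaling.factor", &factor)) {
        gguf_kv_f32(meta, "%s.rope.scale_linear", &factor); // legacy pre-YaRN key
    }
    return factor == 0.0f ? 1.0f : 1.0f / factor; // freq_scale is the reciprocal
}
```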
better option descriptions in help
4d5fe734
gguf : store scaling type as a string instead of an int
74664157
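Storing a string keeps GGUF files self-describing instead of baking enum integers into the format. A sketch of the mapping, with illustrative identifiers:

```c
#include <string.h>

enum rope_scaling_type { ROPE_SCALING_NONE, ROPE_SCALING_LINEAR, ROPE_SCALING_YARN };

static enum rope_scaling_type rope_scaling_from_str(const char * s) {
    if (strcmp(s, "linear") == 0) { return ROPE_SCALING_LINEAR; }
    if (strcmp(s, "yarn")   == 0) { return ROPE_SCALING_YARN;   }
    return ROPE_SCALING_NONE; // "none" or anything unrecognized
}
```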
improve printing of YaRN parameters
4f4e9480
allow forcing ext_factor to zero if scaling type is YaRN
5d7a3a5c
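(Judging from the formulation sketched earlier, ext_factor = 0 makes the ramp mix drop out, so the angle reduces to freq_scale * theta_extrap, i.e. plain linear interpolation, even when the metadata selects YaRN; useful for A/B comparisons.)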
Merge branch 'master' of https://github.com/ggerganov/llama.cpp into …
9bd050f1
fix rope_cuda parameter order
babf0e0c
default n_yarn_orig_ctx to n_ctx_train
0050e1ec
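A sketch of the resulting fallback chain, with illustrative names:

```c
// Resolve YaRN's "original context": explicit CLI value if given, then the
// GGUF original_context_length, then the model's training context.
static unsigned resolve_yarn_orig_ctx(unsigned cli_yarn_orig_ctx,
                                      unsigned gguf_orig_ctx_len,
                                      unsigned n_ctx_train) {
    if (cli_yarn_orig_ctx != 0) { return cli_yarn_orig_ctx; }
    if (gguf_orig_ctx_len != 0) { return gguf_orig_ctx_len; }
    return n_ctx_train;
}
```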
fix uninitialized cparams
09c31027
make printed param formatting more consistent
57c3442e
fix missing import
a20b3e6c
Merge branch 'master' of https://github.com/ggerganov/llama.cpp into …
9ef91b13
Fix YaRN inverted scaling and add "rope.scaling.type" to GGUF (#1)
9ae10b3a
fix YaRN ramp, make mscale conditional, add --yarn-orig-ctx (#2)
14cf93b1
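A usage sketch of the new knob alongside the existing RoPE flags (flag spellings as of this PR's era; check --help for the current ones):

```sh
# Run a 4k-trained model at 16k context with YaRN; --yarn-orig-ctx tells
# YaRN what the pre-extension context length was.
./main -m models/llama-2-7b.Q4_K_M.gguf -c 16384 \
       --rope-scaling yarn --rope-freq-scale 0.25 --yarn-orig-ctx 4096
```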
Merge branch 'master' of https://github.com/ggerganov/llama.cpp into …
237f1e79
Merge branch 'master' of https://github.com/ggerganov/llama.cpp into …
bc8395d5
Merge branch 'master' of https://github.com/ggerganov/llama.cpp into …
4d5ed834
ggerganov
approved these changes on 2023-10-28
fix loading rope.scaling.original_context_length from GGUF (#3)
9fc82382
implement YaRN for GPT-NeoX RoPE
15f26efd
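GPT-NeoX-style RoPE pairs each dimension with its counterpart half a head away rather than with its neighbor, so the same YaRN angle correction applies with different indexing. A sketch, with illustrative helper names:

```c
// Rotate one (x[i0], x[i1]) pair by the YaRN-corrected angle.
static void rope_apply_pair(float * x, int i0, int i1,
                            float cos_theta, float sin_theta) {
    const float x0 = x[i0];
    const float x1 = x[i1];
    x[i0] = x0 * cos_theta - x1 * sin_theta;
    x[i1] = x0 * sin_theta + x1 * cos_theta;
}

// Normal (LLaMA-style) RoPE rotates adjacent pairs:  (2*i, 2*i + 1)
// NeoX-style RoPE rotates split halves of the head:  (i,   i + n_dims/2)
```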
Merge branch 'master' of https://github.com/ggerganov/llama.cpp into …
081f7381
cebtenzzre
merged 898aeca9 into master 2 years ago