llama.cpp
llm : add Falcon support
#2717
Merged

ggerganov merged 38 commits into master from falcon
ggerganov llama : refactor GGUF constants into static maps
4ed3469c
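For context, a minimal sketch of the shape of this refactor, using illustrative enum and map names rather than the exact identifiers in llama.cpp: architecture strings become data in a static map instead of scattered string literals.

```cpp
#include <map>
#include <string>

enum llm_arch {
    LLM_ARCH_LLAMA,
    LLM_ARCH_FALCON,
    LLM_ARCH_UNKNOWN,
};

static const std::map<llm_arch, std::string> LLM_ARCH_NAMES = {
    { LLM_ARCH_LLAMA,  "llama"  },
    { LLM_ARCH_FALCON, "falcon" },
};

// map the general.architecture string from the GGUF header to the enum
// (helper name is an assumption, not the exact llama.cpp identifier)
static llm_arch llm_arch_from_string(const std::string & name) {
    for (const auto & kv : LLM_ARCH_NAMES) {
        if (kv.second == name) {
            return kv.first;
        }
    }
    return LLM_ARCH_UNKNOWN;
}
```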
ggerganov llama : check if model architecture is known
8bd7f06b
ggerganov llama : refactor llama_model_load_internal()
3057d6a6
ggerganov gguf : add KV constant maps
3c025a6d
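Same idea for the hyperparameter keys: a hedged sketch in which the KV names are stored as printf-style templates and instantiated with the architecture name, so Falcon and LLaMA models share one table. Enum and map names here are assumptions.

```cpp
#include <cstdio>
#include <map>
#include <string>

enum llm_kv {
    LLM_KV_CONTEXT_LENGTH,
    LLM_KV_EMBEDDING_LENGTH,
    LLM_KV_BLOCK_COUNT,
};

static const std::map<llm_kv, const char *> LLM_KV_NAMES = {
    { LLM_KV_CONTEXT_LENGTH,   "%s.context_length"   },
    { LLM_KV_EMBEDDING_LENGTH, "%s.embedding_length" },
    { LLM_KV_BLOCK_COUNT,      "%s.block_count"      },
};

// e.g. llm_kv_name(LLM_KV_CONTEXT_LENGTH, "falcon") -> "falcon.context_length"
static std::string llm_kv_name(llm_kv kv, const std::string & arch) {
    char buf[256];
    std::snprintf(buf, sizeof(buf), LLM_KV_NAMES.at(kv), arch.c_str());
    return buf;
}
```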
ggerganov Merge branch 'master' into falcon
b19c6e46
ggerganov llm : read arch-specific KVs
9f28f737
ggerganov convert : add dummy scores + types
d1b3b95d
ggerganov falcon : load tensor data (CPU only)
2f3c80a8
ggerganov llama : fix loading progress bar
5c5413dc
ggerganov llama : add arch member to llama_model
085228e1
ggerganov falcon : CPU inference working
3c7c325b
ggerganov falcon : support non-40B models
2d58444d
ggerganov falcon : minor
0ec27ad6
ggerganov llama : minor updates
7bbbf38c
klosax convert-falcon-hf-to-gguf.py : fix special token mapping
9853f2cf
ggerganov marked this pull request as ready for review 2 years ago
klosax llama.cpp : llama default UNK token = id 0
ffa5099c
klosax llama.cpp : fix bpe tokenizer
a95ae752
klosax llama.cpp : fix the fix of bpe tokenizer
d561b7f7
ggerganov ggml : pass eps to ggml_norm
e3c52bd9
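Falcon's LayerNorm uses a different epsilon than LLaMA's norm, so the previously hard-coded constant becomes a per-call parameter read from the model hyperparameters. Roughly the resulting signature (parameter names approximate):

```cpp
struct ggml_tensor * ggml_norm(
        struct ggml_context * ctx,
        struct ggml_tensor  * a,
        float                 eps);

// usage sketch, assuming the epsilon was loaded from the GGUF KVs:
// cur = ggml_norm(ctx0, inpL, hparams.f_norm_eps);
```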
ggerganov metal : implement RoPE (mode = 2) + avoid ggml_repeat
99bb2607
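Mode 2 is the GPT-NeoX flavor of RoPE that Falcon uses: instead of rotating interleaved pairs (x[2i], x[2i+1]) as in LLaMA's mode 0, it rotates pairs split across the two halves of the head dimension. A toy CPU reference of the NeoX layout, not the Metal kernel itself:

```cpp
#include <cmath>
#include <vector>

// rotate one head of size n at position p, NeoX-style: pair (x[i], x[i + n/2])
static void rope_neox_ref(std::vector<float> & x, int p, float theta_base = 10000.0f) {
    const int n = (int) x.size();
    for (int i = 0; i < n/2; ++i) {
        const float theta = p * std::pow(theta_base, -2.0f*i/n);
        const float c = std::cos(theta);
        const float s = std::sin(theta);
        const float x0 = x[i];
        const float x1 = x[i + n/2];
        x[i]       = x0*c - x1*s;
        x[i + n/2] = x0*s + x1*c;
    }
}
```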
ggerganov ggml : ggml_repeat always creates new tensor
af4bbcc8
ggerganov falcon : copy-paste self-attention from LLaMA
b34ab740
ggerganov metal : print extra compute pipeline info
a0dc47a5
ggerganov falcon : minor changes (still chasing the Metal problem)
e2d23bed
klosax llama.cpp : fix linefeed token
b693000c
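The linefeed id differs per vocab: an SPM vocab stores it as the byte token for 0x0A, while a GPT-2-style BPE vocab remaps the byte, so the id has to be looked up rather than hard-coded. Approximately, using the llama_byte_to_token() helper referenced in a later commit (exact call shape is an assumption):

```cpp
// resolve the linefeed id through the byte-to-token mapping
// vocab.linefeed_id = llama_byte_to_token(vocab, '\n');
```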
ggerganov metal : fix GELU kernel numerical stability by using precise::tanh
0a85ae73
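The GELU kernel uses the usual tanh approximation; with Metal's default fast-math tanh the intermediate can lose precision or blow up for large |x|, and precise::tanh keeps it well-behaved. CPU-side reference form of the function, with the standard ggml constants:

```cpp
#include <cmath>

static const float GELU_COEF_A    = 0.044715f;
static const float SQRT_2_OVER_PI = 0.79788456080286535587989211986876f;

// tanh-approximated GELU; the Metal fix swaps tanh for precise::tanh here
static float gelu_ref(float x) {
    return 0.5f * x * (1.0f + std::tanh(SQRT_2_OVER_PI * x * (1.0f + GELU_COEF_A * x * x)));
}
```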
ggerganov metal : temporary workaround for the concurrency optimization bug
854ae5d0
slaren falcon : add CUDA offloading (#2739)
e7299656
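Conceptually, this kind of offloading places the top n_gpu_layers transformer blocks on the GPU backend and leaves the rest on the CPU. A hypothetical planning helper to show the split; names are assumed, not the actual llama.cpp code:

```cpp
#include <vector>

enum backend_t { BACKEND_CPU, BACKEND_GPU };

// decide per-layer placement: layers [n_layer - n_gpu_layers, n_layer) go to GPU
static std::vector<backend_t> plan_offload(int n_layer, int n_gpu_layers) {
    std::vector<backend_t> plan(n_layer, BACKEND_CPU);
    const int i_gpu_start = n_layer - n_gpu_layers;
    for (int il = 0; il < n_layer; ++il) {
        if (il >= i_gpu_start) {
            plan[il] = BACKEND_GPU;
        }
    }
    return plan;
}
```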
ggerganov llama : better model naming and size reporting
176ea716
ggerganov Merge branch 'master' into falcon
6938c5f4
ggerganov llama : prep new tokenizer support
c3f8a6e4
ggerganov llama : advanced BPE tokenizer based on ggllm.cpp implementation
3bfb7206
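The core of a merge-rank BPE tokenizer, compressed into a self-contained sketch: repeatedly apply the lowest-ranked adjacent merge until none applies. The real implementation also handles UTF-8, byte fallback, and special tokens, and loads the ranks from the GGUF merges list; names and structure here are illustrative only.

```cpp
#include <climits>
#include <map>
#include <string>
#include <utility>
#include <vector>

using bigram = std::pair<std::string, std::string>;

static std::vector<std::string> bpe_tokenize(
        const std::string & word,
        const std::map<bigram, int> & ranks) {
    // start from single characters
    std::vector<std::string> symbols;
    for (char c : word) {
        symbols.push_back(std::string(1, c));
    }
    // greedily apply the lowest-ranked merge until none applies
    while (symbols.size() > 1) {
        int    best_rank = INT_MAX;
        size_t best_i    = 0;
        for (size_t i = 0; i + 1 < symbols.size(); ++i) {
            auto it = ranks.find({symbols[i], symbols[i + 1]});
            if (it != ranks.end() && it->second < best_rank) {
                best_rank = it->second;
                best_i    = i;
            }
        }
        if (best_rank == INT_MAX) {
            break; // no applicable merge left
        }
        symbols[best_i] += symbols[best_i + 1];
        symbols.erase(symbols.begin() + best_i + 1);
    }
    return symbols;
}
```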
ggerganov llama : remove obsolete comment
2424e1d0
ggerganov common : remove obsolete BPE API + disable test-tokenizer-1
596e1094
ggerganov llama : revert BPE special-case in llama_byte_to_token()
f8ee54bd
ggerganov cuda : add TODOs for RoPE NeoX implementation
8c6d3939
ggerganov llama : default special tokens based on vocab type
630d8b40
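Shape of the change, with placeholder ids: the defaults are keyed on the vocab type, with SPM following the SentencePiece convention (UNK=0, BOS=1, EOS=2, cf. the earlier UNK commit) and BPE models like Falcon having no UNK. The BPE ids below are placeholders; the real defaults come from the GGUF KVs when present.

```cpp
enum llama_vocab_type { LLAMA_VOCAB_TYPE_SPM, LLAMA_VOCAB_TYPE_BPE };

struct special_tokens { int bos, eos, unk; };

static special_tokens default_special_tokens(llama_vocab_type type) {
    switch (type) {
        case LLAMA_VOCAB_TYPE_SPM:
            return { /*bos=*/ 1, /*eos=*/ 2, /*unk=*/ 0 };
        case LLAMA_VOCAB_TYPE_BPE:
            return { /*bos=*/ 0, /*eos=*/ 0, /*unk=*/ -1 }; // placeholder ids
    }
    return { -1, -1, -1 };
}
```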
ggerganov perplexity : add log for start of tokenization
fae8faa1
ggerganov merged cf658adc into master 2 years ago
ggerganov deleted the falcon branch 2 years ago
