llama.cpp
llama : support RWKV v6 models
#8980
Merged

ggerganov merged 53 commits into ggml-org:master from MollySophia:for-upstream
github-actions added labels: python, ggml
compilade requested a review from compilade 1 year ago
MollySophia force-pushed 1 year ago
compilade commented on 2024-08-11
MollySophia force-pushed 1 year ago
MollySophia force-pushed 1 year ago
Ronsor commented on 2024-08-11
Ronsor commented on 2024-08-11
Ronsor commented on 2024-08-11
compilade commented on 2024-08-11
MollySophia force-pushed 1 year ago
compilade commented on 2024-08-12
MollySophia force-pushed 1 year ago
MollySophia force-pushed 1 year ago
MollySophia force-pushed 1 year ago
MollySophia commented on 2024-08-13
MollySophia force-pushed 1 year ago
MollySophia force-pushed 1 year ago
compilade commented on 2024-08-25
MollySophia force-pushed 1 year ago
MollySophia force-pushed 1 year ago
compilade commented on 2024-08-25
MollySophia convert_hf_to_gguf: Add support for RWKV v6
8d2eca35
LaylBongers Add RWKV tokenization
dc0767f4
MollySophia Fix build
865167d0
LaylBongers Do not use special tokens when matching in RWKV tokenizer
7cac72a8
LaylBongers Fix model loading
e92c74f4
LaylBongers Add (broken) placeholder graph builder for RWKV
a0aae8d6
LaylBongers Add workaround for kv cache
a8667896
LaylBongers Add logits conversion to rwkv5
4e23d971
LaylBongers Add rwkv5 layer norms
54795885
LaylBongers Add time mix KVRG & correct merge mistake
dd3aa3d4
LaylBongers Add remaining time mix parameters
b409fd8e
LaylBongers Add time mix output loading
3cbeffc5
LaylBongers Add placeholder llm_build_time_mix
b3b17e05
MollySophia Fix build
700dad1b
MollySophia Load more tensors for rwkv v6
a180b63b
MollySophia Fix rwkv tokenizer
0e5ac349
MollySophia ggml: Add unary operator Exp
5732de89
MollySophia RWKV v6 graph building
0784a0cf
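The recurrence this graph-building commit wires up can be sketched outside of ggml. Below is a minimal NumPy sketch of the RWKV v6 per-head wkv update as described in the RWKV papers, not the ggml implementation; shapes, variable names, and the `w = exp(-exp(w_raw))` decay convention are assumptions for illustration:

```python
import numpy as np

def rwkv6_wkv(r, k, v, w, u, S):
    """One attention head over T tokens.
    r, k, v, w: (T, D) arrays; u: (D,) per-channel bonus; S: (D, D) state.
    w is the per-token decay, typically w = exp(-exp(w_raw)) so 0 < w < 1.
    Returns the (T, D) outputs and the updated state."""
    T, D = r.shape
    y = np.zeros((T, D))
    for t in range(T):
        kv = np.outer(k[t], v[t])             # rank-1 (D, D) update
        y[t] = r[t] @ (np.diag(u) @ kv + S)   # current token gets bonus u
        S = np.diag(w[t]) @ S + kv            # decay old state, add new
    return y, S
```

The state `S` carried across calls is what makes RWKV an RNN at inference time: unlike attention, memory per sequence is a fixed `(D, D)` per head regardless of context length.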
MollySophia Add ``rescale_every_n_layers`` parameter
8d498c70
MollySophia Add ``wkv.head_size`` key for RWKV
903089b5
MollySophia Fix offloading layers to CUDA
98ce5f43
MollySophia Fix parallel inferencing for RWKV
01dcf4bb
MollySophia Remove trailing whitespaces
6ae2f486
MollySophia build_rwkv: Avoid using inplace operations
8bc1f9ae
MollySophia convert_hf_to_gguf: rwkv: Avoid using ``eval``
18decea3
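For context on the "avoid using `eval`" change: the RWKV world tokenizer vocab stores each token as a Python str/bytes literal, and `ast.literal_eval` parses such literals without executing arbitrary code. A hedged sketch, assuming a `<id> <literal> <byte_len>` line format (the exact format is an assumption here):

```python
import ast

def parse_vocab_line(line):
    """Parse one assumed vocab line, e.g. "261 ' and' 4"."""
    idx_str, rest = line.split(" ", 1)
    literal, length_str = rest.rsplit(" ", 1)
    token = ast.literal_eval(literal)   # safe: str/bytes literals only
    if isinstance(token, str):
        token = token.encode("utf-8")
    assert len(token) == int(length_str), "declared byte length mismatch"
    return int(idx_str), token
```

Unlike `eval`, `ast.literal_eval` rejects anything that is not a plain literal, so a malicious vocab file cannot run code during conversion.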
MollySophia convert_hf_to_gguf: rwkv tokenizer: Don't escape sequences manually
7f2e370f
MollySophia Update convert_hf_to_gguf.py
c6955525
MollySophia ggml: Add backward computation for unary op ``exp``
8aa711ad
MollySophia Update convert_hf_to_gguf.py
ae9936a8
MollySophia Update convert_hf_to_gguf.py
5afa3eff
MollySophia Use MODEL_ARCH.RWKV6 instead of MODEL_ARCH.RWKV
12fbe1ad
MollySophia build_rwkv6: Simplify graph
276d53b1
MollySophia llama: rwkv6: Detect model.type
b0f4fe52
MollySophia llama: rwkv6: Fix tensor loading for 7B/14B models
683d70cb
MollySophia llama: rwkv6: Fix group_norm assertion failure with Metal
ee1b78c0
MollySophia llama: rwkv6: Clean up
c165e346
MollySophia llama: rwkv6: Add quantization tensor exclusion
6da6aa48
MollySophia llama: rwkv6: Use the new advanced batch splits
f5d955d2
MollySophia Update src/llama.cpp
57decb4a
MollySophia llama: rwkv6: Use ``ggml_norm`` instead of ``ggml_group_norm``
e94778ad
MollySophia llama: rwkv6: Apply code style and misc changes
7756afd8
MollySophia converter: Use class name ``Rwkv6Model``
87a29014
MollySophia llama: rwkv6: Make use of key ``feed_forward_length``
c414a24a
MollySophia llama: rwkv6: Add kv ``time_mix_extra_dim`` and ``time_decay_extra_dim``
6d69fd77
MollySophia converter: Match ``new_name`` instead of ``name`` for float32 explici…
601b5920
MollySophia llama: rwkv6: Keep ``time_mix_w1/w2`` as F32
e0ea5114
MollySophia llama: rwkv6: Remove unused nodes
5f00c52b
MollySophia llama: rwkv6: Apply code format changes
7444046c
MollySophia force-pushed to 7444046c 1 year ago
MollySophia llama: rwkv6: Add lora for some supported tensors
7f2ef566
ggerganov rwkv : speed-up tokenization using trie
7004323e
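The idea behind this speed-up commit: instead of probing every vocab entry at each byte position, greedy longest-match tokenization walks a byte trie once per position. A minimal sketch of the technique (toy vocab and nested-dict trie; not the llama.cpp data structure):

```python
def build_trie(vocab):
    """vocab: dict of token bytes -> token id. Returns a nested-dict trie."""
    root = {}
    for tok, tid in vocab.items():
        node = root
        for b in tok:
            node = node.setdefault(b, {})
        node[None] = tid            # None key marks "a token ends here"
    return root

def tokenize(data, root):
    """Greedy longest match over bytes, left to right."""
    ids, i = [], 0
    while i < len(data):
        node, best_id, best_len = root, None, 0
        j = i
        while j < len(data) and data[j] in node:
            node = node[data[j]]
            j += 1
            if None in node:        # remember the longest match so far
                best_id, best_len = node[None], j - i
        if best_id is None:
            raise ValueError(f"no token matches byte {data[i]!r}")
        ids.append(best_id)
        i += best_len
    return ids
```

Each position now costs at most the length of the longest matching token, rather than a scan over the whole vocabulary.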
ggerganov minor : style + indentation
59dc2e70
ggerganov approved these changes on 2024-08-30
ggerganov requested a review from compilade 1 year ago
compilade approved these changes on 2024-08-30
MollySophia llama: rwkv6: Avoid division by zero
51753757
MollySophia ggml: rwkv_wkv: Avoid copying the state
846358d3
ggerganov merged 8f1d81a0 into master 1 year ago
