llama.cpp
Add support for QRWKV6 hybrid models & slight optimization for RWKV6
#11001
Merged


MollySophia merged 12 commits into ggml-org:master from MollySophia:rwkv6qwen2
github-actions added Nvidia GPU
github-actions added Vulkan
github-actions added python
github-actions added ggml
github-actions added SYCL
github-actions added testing
MollySophia WIP: Add support for RWKV6Qwen2
f298f039
MollySophia RWKV: Some graph simplification
385b611d
MollySophia Add support for RWKV6Qwen2 with cpu and cuda GLA
fab0aa7b
MollySophia RWKV6[QWEN2]: Concat lerp weights together to reduce cpu overhead
bc930cd5
MollySophia Fix some typos
f2c1a5c9
MollySophia code format changes
aaa870e8
MollySophia Fix wkv test & add gla test
00930e6f
MollySophia Fix cuda warning
08cf5606
MollySophia Update README.md
331581b2
MollySophia force pushed to 331581b2 1 year ago
ggerganov approved these changes on 2025-01-07
ggerganov requested a review from compilade 1 year ago
MollySophia Update ggml/src/ggml-cuda/gla.cu
aed0afb4
MollySophia Fix fused lerp weights loading with RWKV6
d8a304c2
compilade commented on 2025-01-07
MollySophia better sanity check skipping for QRWKV6 in llama-quant
324afba5
compilade approved these changes on 2025-01-10
MollySophia merged ee7136c6 into master 1 year ago
