llama.cpp
Add support for QRWKV6 hybrid models & slight optimization for RWKV6
#11001
Merged
MollySophia merged 12 commits into ggml-org:master from MollySophia:rwkv6qwen2
github-actions added labels: Nvidia GPU, Vulkan, python, ggml, SYCL, testing
WIP: Add support for RWKV6Qwen2 (f298f039)
RWKV: Some graph simplification (385b611d)
Add support for RWKV6Qwen2 with cpu and cuda GLA (fab0aa7b)
RWKV6[QWEN2]: Concat lerp weights together to reduce cpu overhead (bc930cd5)
Fix some typos (f2c1a5c9)
code format changes (aaa870e8)
Fix wkv test & add gla test (00930e6f)
Fix cuda warning (08cf5606)
Update README.md (331581b2)
MollySophia force pushed to 331581b2 1 year ago
ggerganov approved these changes on 2025-01-07
ggerganov requested a review from compilade 1 year ago
Update ggml/src/ggml-cuda/gla.cu (aed0afb4)
Fix fused lerp weights loading with RWKV6 (d8a304c2)
compilade commented on 2025-01-07
better sanity check skipping for QRWKV6 in llama-quant (324afba5)
compilade approved these changes on 2025-01-10
MollySophia merged ee7136c6 into master 1 year ago
Reviewers: ggerganov, compilade
Assignees: No one assigned
Labels: testing, Nvidia GPU, Vulkan, python, ggml, SYCL
Milestone: No milestone