llama.cpp
Commit b70d2510
CUDA: add gqa_ratio 4 for GLM 4.7 flash (#18953)
Committed 48 days ago
References: #18953 - CUDA: add gqa_ratio 4 for GLM 4.7 flash
Author: am17an
Parent: 5516b9c1