llama.cpp
b70d2510 - CUDA: add gqa_ratio 4 for GLM 4.7 flash (#18953)

Commit
48 days ago