llama.cpp 3bcc9909 - CUDA: refactor topk-moe to enable more models (GLM 4.7, Nemotron etc.) (#19126)
Commit
45 days ago
CUDA: refactor topk-moe to enable more models (GLM 4.7, Nemotron etc.) (#19126)
References
#19126 - CUDA: refactor topk-moe to enable more models (GLM 4.7, Nemotron etc.)
Author
am17an
Parents
d4964a7c