whisper.cpp
62ba8b53
- CUDA: refactor topk-moe to enable more models (GLM 4.7, Nemotron etc.) (llama/19126)
Commit
3 days ago
References
#3636 - sync : ggml
Author
am17an
Committer
ggerganov
Parents
f0e85bb1