llama.cpp
CUDA: refactor topk-moe to enable more models (GLM 4.7, Nemotron etc.)
#19126
Merged

am17an merged 3 commits into ggml-org:master from am17an:topk-cuda-refactor
am17an CUDA: refactor topk-moe to enable more models (GLM, Nemotron etc.)
bcbe257b
github-actions added Nvidia GPU
github-actions added ggml
am17an template bias
1ae43b9b
am17an force pushed from 3245ced7 to 1ae43b9b 65 days ago
JohannesGaessler approved these changes on 2026-01-27
am17an review: formatting
eeb9b04a
am17an force pushed from 3ad63db7 to eeb9b04a 64 days ago
am17an merged 3bcc9909 into master 63 days ago
am17an deleted the topk-cuda-refactor branch 63 days ago
