CUDA: refactor topk-moe to enable more models (GLM 4.7, Nemotron etc.) #19126
CUDA: refactor topk-moe to enable more models (GLM, Nemotron etc.)
bcbe257b
template bias
1ae43b9b
am17an force pushed from 3245ced7 to 1ae43b9b 65 days ago
review: formatting
eeb9b04a
am17an force pushed from 3ad63db7 to eeb9b04a 64 days ago
am17an merged 3bcc9909 into master 63 days ago
am17an deleted the topk-cuda-refactor branch 63 days ago