llama.cpp
3bcc9909 - CUDA: refactor topk-moe to enable more models (GLM 4.7, Nemotron etc.) (#19126)

Commit

104 days ago

CUDA: refactor topk-moe to enable more models (GLM 4.7, Nemotron etc.) (#19126)

References

#19126 - CUDA: refactor topk-moe to enable more models (GLM 4.7, Nemotron etc.)

Author

am17an
