llama.cpp
06dfde3e
- llama : add basic support for offloading moe with CUDA
2 years ago
llama : add basic support for offloading moe with CUDA
References
#4406 - llama : add Mixtral support
Author
slaren
Committer
slaren
Parents
2cbcba82
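For context on what this commit enables: a Mixture-of-Experts (MoE) feed-forward block, as used by Mixtral (#4406 above), routes each token through only a small top-k subset of expert networks, and that sparsity is what makes offloading the expert weights to the GPU worthwhile. Below is a minimal, illustrative sketch of top-k expert routing in plain Python. It is not llama.cpp's actual implementation, and all names (`moe_forward`, `gate_w`, `expert_ws`, `top_k`) are assumptions made for this toy example.

```python
import math
import random

def moe_forward(x, gate_w, expert_ws, top_k=2):
    """Toy MoE block: route one token vector x through its top_k experts.

    x:         length-d token activation
    gate_w:    n_experts rows of length-d router weights
    expert_ws: n_experts toy experts, each a d x d weight matrix
    """
    n_experts = len(gate_w)
    # Router: score every expert with a dot product against the token.
    logits = [sum(g * xi for g, xi in zip(gate_w[e], x)) for e in range(n_experts)]
    # Pick the top_k highest-scoring experts.
    top = sorted(range(n_experts), key=lambda e: logits[e])[-top_k:]
    # Softmax over the selected experts only (numerically stabilized).
    m = max(logits[e] for e in top)
    w = [math.exp(logits[e] - m) for e in top]
    s = sum(w)
    w = [wi / s for wi in w]
    # Only the chosen experts run; the rest are skipped entirely.
    # This per-token sparsity is why per-expert GPU offload pays off.
    out = [0.0] * len(x)
    for wi, e in zip(w, top):
        for i in range(len(x)):
            out[i] += wi * sum(expert_ws[e][i][j] * x[j] for j in range(len(x)))
    return out

random.seed(0)
d, n = 4, 4
x = [random.gauss(0, 1) for _ in range(d)]
gate_w = [[random.gauss(0, 1) for _ in range(d)] for _ in range(n)]
expert_ws = [[[random.gauss(0, 1) for _ in range(d)] for _ in range(d)]
             for _ in range(n)]
y = moe_forward(x, gate_w, expert_ws)
print(len(y))  # 4
```

In the real model only 2 of Mixtral's 8 experts execute per token, so keeping the expert tensors resident on the GPU (as this commit begins to support for CUDA) avoids repeatedly transferring the largest weights in the network.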