llama.cpp
48e2fa9f
- CUDA: add fp kernel for larger batch size MoE (#16512)
Commit
179 days ago
CUDA: add fp kernel for larger batch size MoE (#16512)

* CUDA: kernel for larger batch sizes for MoE
* WIP
* WIP
* WIP
* WIP
* WIP
* WIP
* fixup
* tests
* Move mmq_ids_helper to mmid
* cleanup
* Remove redundant checks
References
#16512 - CUDA: add fp kernel for larger batch size MoE
Author
am17an
Parents
5b6913c4