llama.cpp
48e2fa9f - CUDA: add fp kernel for larger batch size MoE (#16512)

Commit
179 days ago
CUDA: add fp kernel for larger batch size MoE (#16512)

* CUDA: kernel for larger batch sizes for MoE
* WIP
* WIP
* WIP
* WIP
* WIP
* WIP
* fixup
* tests
* Move mmq_ids_helper to mmid
* cleanup
* Remove redundant checks