llama.cpp
e56abd20
- vulkan: Implement topk_moe fused shader, ported from CUDA (#16641)
Commit
57 days ago
vulkan: Implement topk_moe fused shader, ported from CUDA (#16641)

This is similar to the CUDA shader from #16130, but doesn't use shared memory and handles different subgroup sizes.
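For readers unfamiliar with the operation: top-k mixture-of-experts routing is commonly described as a softmax over one token's expert logits, followed by keeping the k largest probabilities. Below is a minimal CPU reference sketch of that general idea in Python. It is not the Vulkan shader or the CUDA kernel from this commit, and it assumes the fused operation follows the common softmax-then-top-k pattern, which the commit message does not spell out. The function name `topk_moe_reference` and the `renormalize` flag are hypothetical, chosen only for illustration.

```python
import math

def topk_moe_reference(logits, k, renormalize=False):
    """Hypothetical reference: softmax over expert logits, then keep the top k."""
    if not 0 < k <= len(logits):
        raise ValueError("k must be in 1..len(logits)")
    # Numerically stable softmax: subtract the maximum before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Expert indices by descending probability; the stable sort keeps
    # the lower index first when two probabilities are equal.
    order = sorted(range(len(probs)), key=lambda i: -probs[i])
    ids = order[:k]
    weights = [probs[i] for i in ids]
    if renormalize:
        # Assumed optional step: rescale the selected weights to sum to 1.
        s = sum(weights)
        weights = [w / s for w in weights]
    return ids, weights

# Example: four experts, keep the best two.
ids, weights = topk_moe_reference([1.0, 3.0, 2.0, 0.0], k=2)
```

A fused GPU shader would typically run these steps for one row of logits in a single dispatch, instead of in separate softmax and selection passes. How this commit distributes that work across a subgroup is not reproduced here.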
References
#16641 - vulkan: Implement topk_moe fused shader, ported from CUDA
Author
jeffbolznv
Parents
38355c6c