llama.cpp
24af22fc
- ggml : optimize cuda ssm_scan using warp-level reduction (#18505)
Commit
9 days ago
ggml : optimize cuda ssm_scan using warp-level reduction (#18505)

* ggml : optimize cuda ssm_scan using warp-level reduction
* ggml : apply code review suggestions (style, const, constexpr)
* ggml : add TODO regarding stride consistency
References
#18505 - ggml : optimize cuda ssm_scan using warp-level reduction
Author
Aadeshveer
Parents
07fbe19f