llama.cpp
cuda: refactored ssm_scan and use CUB
#13291
Merged

Commits
  • cuda: refactored ssm_scan to use CUB
    Your-Cheese committed 331 days ago
  • fixed compilation error when not using CUB
    Your-Cheese committed 331 days ago
  • assign L to constant and use size_t instead of int
    Your-Cheese committed 324 days ago
  • deduplicated functions
    Your-Cheese committed 324 days ago
  • change min blocks per mp to 1
    Your-Cheese committed 324 days ago
  • Use cub load and store warp transpose
    Your-Cheese committed 324 days ago
  • Merge https://github.com/ggml-org/llama.cpp into ssm_scan_cub
    Your-Cheese committed 239 days ago
  • suppress clang warning
    Your-Cheese committed 235 days ago