llama.cpp
CUDA: Conv2d Tensor Core
#15813
Closed

Commits
  • CUDA: cov2d with tensor core
    mnehete32 committed 16 days ago
  • CUDA: conv2d added comment
    mnehete32 committed 16 days ago
  • CUDA: conv2d support fp16 without wmma
    mnehete32 committed 16 days ago
  • CUDA: conv2d using mma.cuh
    mnehete32 committed 9 days ago
  • CUDA: conv2d convert int64_t to int
    mnehete32 committed 9 days ago
  • CUDA: conv2d update block size
    mnehete32 committed 8 days ago
  • Merge branch 'master' of https://github.com/mnehete32/llama.cpp into conv2d_tensor_core
    mnehete32 committed 5 days ago
  • CUDA: conv2d performance optimization
    mnehete32 committed 5 days ago
  • CUDA: conv2d minor fixes
    mnehete32 committed 5 days ago