llama.cpp
dc685be4 - CUDA: add FP32 FlashAttention vector kernel (#7188)

Committed 1 year ago
Full commit message (the title plus the squashed fixup! entries):

  • CUDA: add FP32 FlashAttention vector kernel
  • fixup! CUDA: add FP32 FlashAttention vector kernel
  • fixup! fixup! CUDA: add FP32 FlashAttention vector kernel
  • fixup! fixup! fixup! CUDA: add FP32 FlashAttention vector kernel
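For context on what a "FlashAttention vector kernel" computes: in the single-query (vector) path, one query row attends over all key/value pairs using the online-softmax trick, so the full score row is never materialized and only a running max, running denominator, and output accumulator are kept. The sketch below is a hypothetical CPU reference of that computation in plain C++, not code from the llama.cpp sources; the function name and signatures are illustrative, and the actual commit implements this per-thread-block in CUDA for both FP16 and (with this change) FP32 arithmetic.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference for the single-query ("vector") FlashAttention
// path. q is one query of dimension d; K and V each hold n_kv rows of
// dimension d. Online softmax: rescale the accumulated state whenever a new
// running maximum appears, so no n_kv-sized score buffer is needed.
std::vector<float> flash_attn_vec_ref(const std::vector<float>& q,
                                      const std::vector<std::vector<float>>& K,
                                      const std::vector<std::vector<float>>& V,
                                      float scale) {
    const size_t d = q.size();
    std::vector<float> o(d, 0.0f);  // un-normalized output accumulator
    float m = -INFINITY;            // running maximum of the scores
    float s = 0.0f;                 // running softmax denominator

    for (size_t i = 0; i < K.size(); ++i) {
        float x = 0.0f;             // score = scale * dot(q, K[i])
        for (size_t j = 0; j < d; ++j) x += q[j] * K[i][j];
        x *= scale;

        const float m_new = std::fmax(m, x);
        const float c_old = std::exp(m - m_new);  // rescales previous state
        const float c_new = std::exp(x - m_new);  // weight of this kv pair
        for (size_t j = 0; j < d; ++j) o[j] = o[j] * c_old + c_new * V[i][j];
        s = s * c_old + c_new;
        m = m_new;
    }
    for (size_t j = 0; j < d; ++j) o[j] /= s;     // final normalization
    return o;
}
```

Doing all of this in FP32 (as this commit adds) trades speed for kernels that also run correctly on GPUs without fast FP16 arithmetic and avoids FP16 accumulation error.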
Files changed:

  • ggml-cuda.cu
  • ggml-cuda/common.cuh
  • ggml-cuda/fattn-common.cuh
  • ggml-cuda/fattn-vec-f16.cu
  • ggml-cuda/fattn-vec-f16.cuh
  • ggml-cuda/fattn-vec-f32.cu
  • ggml-cuda/fattn-vec-f32.cuh
  • ggml-cuda/fattn.cu
  • tests/test-backend-ops.cpp