llama.cpp
dc685be4 - CUDA: add FP32 FlashAttention vector kernel (#7188)

CUDA: add FP32 FlashAttention vector kernel (#7188)

* CUDA: add FP32 FlashAttention vector kernel
* fixup! CUDA: add FP32 FlashAttention vector kernel
* fixup! fixup! CUDA: add FP32 FlashAttention vector kernel
* fixup! fixup! fixup! CUDA: add FP32 FlashAttention vector kernel