llama.cpp
c13fcfbf - cuda : batched cuBLAS GEMMs for src0 F16 and src1 F32 (attention ops)

Commit
1 year ago
  • File
    ggml-cuda.cu