llama.cpp
8ad92dc1 - ggml : switch to padded F16 mask for ggml_soft_max, ggml_flash_attn_ext

Changed files:
  • ggml-cuda.cu
  • ggml-metal.m
  • ggml-metal.metal
  • ggml.c
  • ggml.h
  • llama.cpp
  • tests/test-backend-ops.cpp
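
For context, the commit title describes allocating the KQ attention mask as an F16 tensor whose token dimension is rounded up to a fixed multiple, so the CUDA and Metal kernels behind ggml_soft_max and ggml_flash_attn_ext can read the mask in half precision and in fixed-size chunks. The sketch below illustrates that idea using the public ggml API; the padding constant (32) and the helper name are assumptions for illustration, not copied from this diff.

```c
// Minimal sketch of a "padded F16 mask", assuming the public ggml API.
// KQ_MASK_PAD and build_kq_mask are hypothetical names; the real constant
// and construction live in llama.cpp around this commit.
#include "ggml.h"

#define KQ_MASK_PAD 32 // hypothetical padding multiple

// Build the KQ attention mask as F16, with the token dimension rounded
// up so backend kernels can process the mask in aligned vector chunks.
static struct ggml_tensor * build_kq_mask(
        struct ggml_context * ctx, int64_t n_kv, int64_t n_tokens) {
    // GGML_PAD is the round-up macro from ggml.h: it rounds n_tokens up
    // to the next multiple of KQ_MASK_PAD.
    const int64_t n_tokens_padded = GGML_PAD(n_tokens, KQ_MASK_PAD);

    // F16 instead of F32 halves the mask's memory traffic; this is the
    // tensor the soft-max and flash-attention ops consume as their mask.
    return ggml_new_tensor_2d(ctx, GGML_TYPE_F16, n_kv, n_tokens_padded);
}
```

Presumably the rows beyond n_tokens are filled with -INFINITY so the softmax assigns them zero weight; that detail, and the exact padding value, are determined by the actual implementation in the changed files above.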