llama.cpp
metal : optimize ggml_mul_mat_id (faster Mixtral PP)
#4725
Merged

Commits
  • ggml : disable fast-math for Metal (cmake build only)
    ggerganov committed 2 years ago
  • metal : fix Metal API debug warnings
    ggerganov committed 2 years ago
  • cmake : add -fno-inline for Metal build (#4545)
    ggerganov committed 2 years ago
  • metal : fix API debug warnings
    ggerganov committed 2 years ago
  • metal : fix compile warnings
    ggerganov committed 2 years ago
  • metal : use uint64_t for strides
    ggerganov committed 2 years ago
  • cmake : rename option to LLAMA_METAL_SHADER_DEBUG
    ggerganov committed 2 years ago
  • metal : fix mat-vec Q8_0 kernel for BS > 1
    ggerganov committed 2 years ago
  • metal : normalize mat-vec kernel signatures
    ggerganov committed 2 years ago
  • cmake : respect LLAMA_QKK_64 option
    ggerganov committed 2 years ago
  • metal : fix mat-vec Q4_K kernel for QK_K == 64
    ggerganov committed 2 years ago
  • metal : optimizing ggml_mul_mat_id (wip)
    ggerganov committed 2 years ago
  • Merge branch 'master' into gg/metal-opt-mul-mat-id
    ggerganov committed 2 years ago
  • Merge branch 'master' into gg/metal-opt-mul-mat-id
    ggerganov committed 2 years ago
  • metal : minor fix
    ggerganov committed 2 years ago
  • Merge branch 'master' into gg/metal-opt-mul-mat-id
    ggerganov committed 2 years ago
  • metal : opt mul_mm_id
    ggerganov committed 2 years ago