llama.cpp
1873ff58 - metal : add gqa8 kernel to allow llama-2-70B on metal (#2459)

2 years ago
metal : add gqa8 kernel to allow llama-2-70B on metal (#2459)

* Added gqa8 kernel to allow llama-2-70B on metal

* Update ggml-metal.m

Co-authored-by: Cebtenzzre <cebtenzzre@gmail.com>

* Extend kernel_mul_mat_f16_f32 to handle gqa broadcast

* Added ne03==ne13 assertion

---------

Co-authored-by: Cebtenzzre <cebtenzzre@gmail.com>
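For readers unfamiliar with the "gqa broadcast" mentioned above: under grouped-query attention, several query heads share one key/value head, so a matrix-multiply kernel has to map each query-head index onto the smaller set of key/value heads. The sketch below illustrates that index mapping only. The helper name `gqa_kv_head` and its parameters are hypothetical and are not taken from ggml-metal.m; the actual kernel may compute the index differently.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of ggml): returns the key/value head that a
 * given query head reads from when n_head query heads share n_head_kv
 * key/value heads. Assumes n_head is an exact multiple of n_head_kv.
 */
static int64_t gqa_kv_head(int64_t q_head, int64_t n_head, int64_t n_head_kv) {
    assert(n_head_kv > 0 && n_head % n_head_kv == 0);
    assert(q_head >= 0 && q_head < n_head);

    /* Group size: 8 for LLaMA-2-70B (64 query heads, 8 key/value heads). */
    const int64_t gqa = n_head / n_head_kv;

    /* Consecutive query heads in the same group share one key/value head. */
    return q_head / gqa;
}
```

With 64 query heads and 8 key/value heads, query heads 0 through 7 map to key/value head 0, heads 8 through 15 map to head 1, and so on. When the two head counts are equal, the mapping reduces to the identity, which is the case without grouping.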