llama.cpp
1873ff58 - metal : add gqa8 kernel to allow llama-2-70B on metal (#2459)

Commit

2 years ago

metal : add gqa8 kernel to allow llama-2-70B on metal (#2459)

* Added gqa8 kernel to allow llama-2-70B on metal
* Update ggml-metal.m

  Co-authored-by: Cebtenzzre <cebtenzzre@gmail.com>

* Extend kernel_mul_mat_f16_f32 to handle gqa broadcast
* Added ne03==ne13 assertion

---------

Co-authored-by: Cebtenzzre <cebtenzzre@gmail.com>
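The "gqa broadcast" named in the commit message can be illustrated with a small sketch. Llama-2-70B uses grouped-query attention with 64 query heads sharing 8 key/value heads, so each group of 8 query heads reads the same key/value head. The Python below is only an illustration of that index mapping under stated assumptions, not the Metal kernel from this commit. The function names `kv_head_for_query_head` and `broadcast_mul_mat` are hypothetical, the inputs are plain nested lists, and the outermost batch dimension covered by the `ne03==ne13` assertion is ignored.

```python
def kv_head_for_query_head(q_head: int, n_q_heads: int, n_kv_heads: int) -> int:
    """Map a query head index to the key/value head it shares (hypothetical helper)."""
    if n_kv_heads <= 0 or n_q_heads % n_kv_heads != 0:
        raise ValueError("n_q_heads must be a positive multiple of n_kv_heads")
    if not 0 <= q_head < n_q_heads:
        raise IndexError("q_head out of range")
    group_size = n_q_heads // n_kv_heads  # 64 // 8 == 8 for Llama-2-70B
    return q_head // group_size


def broadcast_mul_mat(src0, src1):
    """Multiply each src1 head (a vector) by the src0 head (a matrix) it maps to.

    src0: list of matrices, one per key/value head, each shaped [rows][k].
    src1: list of vectors, one per query head, each of length k.
    Returns one output vector of length `rows` per query head.
    """
    out = []
    for q_head, vec in enumerate(src1):
        # Several query heads reuse the same src0 matrix: this is the broadcast.
        mat = src0[kv_head_for_query_head(q_head, len(src1), len(src0))]
        out.append([sum(a * b for a, b in zip(row, vec)) for row in mat])
    return out
```

With 64 query heads and 8 key/value heads, query heads 0 through 7 map to key/value head 0, heads 8 through 15 map to head 1, and so on up to head 63 mapping to head 7.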

References

#2459 - Updated mul_mat_f16_f32 metal kernel to allow llama-2-70B on metal

Author

mbosc

Parents

49e7cb5b
