llama.cpp
214b6a35
- ggml : adjust mul_mat_f16 work memory (#1226)
Commit
2 years ago
ggml : adjust mul_mat_f16 work memory (#1226)

* llama : minor - remove explicit int64_t cast
* ggml : reduce memory buffer for F16 mul_mat when not using cuBLAS
* ggml : add asserts to guard for incorrect wsize
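The gist of the change can be sketched as follows: size the work buffer for the F16 conversion explicitly, and assert that the caller provided at least that much. This is a minimal illustration in C; the function and type names (`mul_mat_f16_wsize`, `fp16_t`, the `ne00`/`ne01` dimension names) are assumptions modeled on ggml's conventions, not its actual API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-in for ggml's 16-bit half-float storage type. */
typedef uint16_t fp16_t;

/* Work memory needed to hold one F16 copy of an ne00 x ne01 F32 matrix
 * before an F16 x F16 mul_mat (the non-cuBLAS path the commit shrinks). */
static size_t mul_mat_f16_wsize(int64_t ne00, int64_t ne01) {
    return (size_t)(ne00 * ne01) * sizeof(fp16_t);
}

/* Guard for an incorrect wsize, in the spirit of the asserts the
 * commit adds: fail fast if the work buffer is undersized. */
static void mul_mat_f16(void *wdata, size_t wsize,
                        int64_t ne00, int64_t ne01) {
    assert(wsize >= mul_mat_f16_wsize(ne00, ne01));
    (void)wdata; /* the conversion and matmul would use wdata here */
}
```

The assert turns a silent buffer overrun into an immediate, debuggable failure when the planner's size computation and the kernel's actual usage disagree.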
References
#1226 - Adjust mul_mat_f16 work memory
Author
ggerganov
Parents
305eb5af