llama.cpp
Commit 36ca8b36
Committed 251 days ago
CUDA: don't convert BF16 weights to FP32 (ggml/1174)

* add bf16 support
* use convert_from_bf16_cuda instead of convert_unary_cuda for f32
* revert 7ec5085
* move functionality into convert_unary with constexpr
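The last bullet can be illustrated with a short sketch. This is a minimal, assumed reconstruction, not the actual ggml code: the kernel name convert_unary comes from the commit message, but the signature, the launcher, and the grid setup are illustrative. The idea is that one templated kernel handles every source type, with the BF16 branch resolved at compile time via if constexpr, so BF16 weights are converted directly on the GPU instead of being staged through FP32.

#include <cuda_bf16.h>
#include <type_traits>

// One conversion kernel for all source types; the BF16 case is selected
// at compile time, so no separate bf16 kernel is needed and the discarded
// branch is never instantiated for that type.
template <typename src_t, typename dst_t>
static __global__ void convert_unary(const void * __restrict__ vx,
                                     dst_t * __restrict__ y, const int64_t k) {
    const int64_t i = (int64_t) blockDim.x * blockIdx.x + threadIdx.x;
    if (i >= k) {
        return;
    }
    const src_t * x = (const src_t *) vx;
    if constexpr (std::is_same_v<src_t, __nv_bfloat16>) {
        y[i] = (dst_t) __bfloat162float(x[i]); // widen BF16 directly on device
    } else {
        y[i] = (dst_t) x[i];                   // plain cast for other types
    }
}

// Hypothetical launcher, roughly what a convert_from_bf16_cuda wrapper
// might reduce to after the refactor.
template <typename src_t, typename dst_t>
static void convert_unary_cuda(const void * vx, dst_t * y, const int64_t k,
                               cudaStream_t stream) {
    const int block_size = 256;
    const int num_blocks = (int) ((k + block_size - 1) / block_size);
    convert_unary<src_t><<<num_blocks, block_size, 0, stream>>>(vx, y, k);
}

With dst_t = half, for instance, BF16 weights can feed an FP16 GEMM path in a single conversion pass, which is the behavior the title describes: no intermediate FP32 copy of the weights.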
References: #12732 - sync : ggml
Author: CISC
Committer: ggerganov
Parents: 995083e4