llama.cpp
Commit 36ca8b36
Committed 251 days ago
CUDA: don't convert BF16 weights to FP32 (ggml/1174)

* add bf16 support
* use convert_from_bf16_cuda instead of convert_unary_cuda for f32
* revert 7ec5085
* move functionality into convert_unary with constexpr
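The last bullet can be illustrated with a short sketch. This is a minimal, assumed reconstruction, not the actual ggml code: the kernel name convert_unary comes from the commit message, but the signature, the launcher, and the grid setup are illustrative. The idea is that one templated kernel handles every source type, with the BF16 branch resolved at compile time via if constexpr, so BF16 weights are converted directly on the GPU instead of being staged through FP32.

#include <cuda_bf16.h>
#include <type_traits>

// One conversion kernel for all source types; the BF16 case is selected
// at compile time, so no separate bf16 kernel is needed and the discarded
// branch is never instantiated for that type.
template <typename src_t, typename dst_t>
static __global__ void convert_unary(const void * __restrict__ vx,
                                     dst_t * __restrict__ y, const int64_t k) {
    const int64_t i = (int64_t) blockDim.x * blockIdx.x + threadIdx.x;
    if (i >= k) {
        return;
    }
    const src_t * x = (const src_t *) vx;
    if constexpr (std::is_same_v<src_t, __nv_bfloat16>) {
        y[i] = (dst_t) __bfloat162float(x[i]); // widen BF16 directly on device
    } else {
        y[i] = (dst_t) x[i];                   // plain cast for other types
    }
}

// Hypothetical launcher, roughly what a convert_from_bf16_cuda wrapper
// might reduce to after the refactor.
template <typename src_t, typename dst_t>
static void convert_unary_cuda(const void * vx, dst_t * y, const int64_t k,
                               cudaStream_t stream) {
    const int block_size = 256;
    const int num_blocks = (int) ((k + block_size - 1) / block_size);
    convert_unary<src_t><<<num_blocks, block_size, 0, stream>>>(vx, y, k);
}

With dst_t = half, for instance, BF16 weights can feed an FP16 GEMM path in a single conversion pass, which is the behavior the title describes: no intermediate FP32 copy of the weights.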
References: #12732 - sync : ggml
Author: CISC
Committer: ggerganov
Parents: 995083e4