Use FP32 compute in cuBLAS on V100 to avoid overflows; add env variables to override the cuBLAS compute type (#19959)
* Update ggml-cuda.cu
* Update build.md
* Update ggml/src/ggml-cuda/ggml-cuda.cu
---------
Co-authored-by: Johannes Gäßler <johannesg@5d6.de>