llama.cpp
Use fp32 in cuBLAS V100 to avoid overflows, env variables to override cuBLAS compute type
#19959

Merged

am17an merged 13 commits into ggml-org:master from forced_cublas_new

Update ggml-cuda.cu

d2237d39

Update ggml-cuda.cu

c5ec63ac

github-actions added Nvidia GPU

github-actions added ggml

Update build.md

b1939bd4

github-actions added documentation

Update build.md

5b51d4d6

JohannesGaessler commented on 2026-03-02

Update ggml/src/ggml-cuda/ggml-cuda.cu

787710d0

JohannesGaessler commented on 2026-03-03

Merge branch 'ggml-org:master' into forced_cublas_new

57641d55

Update ggml-cuda.cu

b84198c7

Update build.md

2e7693be

wallentri88 requested a review from JohannesGaessler 24 days ago

JohannesGaessler approved these changes on 2026-03-04

ORippler commented on 2026-03-04

Update ggml/src/ggml-cuda/ggml-cuda.cu

d8ff8ebe

Update build.md

ad27ec11

Merge branch 'ggml-org:master' into forced_cublas_new

808594df

Update ggml-cuda.cu

77a8e2a6

Update ggml-cuda.cu

d23aac8b

am17an merged f2c0dfb7 into master 14 days ago

wallentri88 deleted the forced_cublas_new branch 13 days ago

Reviewers

JohannesGaessler

ORippler

Assignees

No one assigned

Labels

documentation, Nvidia GPU, ggml

Milestone

No milestone
