llama.cpp
CUDA: limit number of FA stream-k CUDA blocks
#20586
Merged
JohannesGaessler merged 1 commit into ggml-org:master from JohannesGaessler:cuda-fa-min-chunk-size
Commit cc1232a4: CUDA: limit number of FA stream-k CUDA blocks
am17an approved these changes on 2026-03-15
github-actions added the Nvidia GPU label
github-actions added the ggml label
JohannesGaessler merged ae40cd27 into master 62 days ago
Reviewers: am17an
Assignees: No one assigned
Labels: Nvidia GPU, ggml
Milestone: No milestone