llama.cpp
[CUDA] Reduce CPU-side stalls due to the CUDA command buffer being full
#19042
Merged
ggerganov merged 4 commits into ggml-org:master from gaugarg-nv:pp_perf_improve
[CUDA] Reduce CPU-side stalls due to the CUDA command buffer being full
d3298dc3
Set the env variable in the CUDA backend registry allocation
29c73efe
github-actions added Nvidia GPU
github-actions added ggml
Add link to PR in code comment
14de97eb
JohannesGaessler commented on 2026-01-24
Remove warning logs and update documentation
ed2e4840
github-actions added documentation
JohannesGaessler approved these changes on 2026-01-26
ggerganov approved these changes on 2026-01-26
ggerganov merged a83c73a1 into master 73 days ago
Reviewers
ggerganov
JohannesGaessler
Assignees
No one assigned
Labels
documentation
Nvidia GPU
ggml
Milestone
No milestone