llama.cpp
CUDA: only allocate FA tmp buffer if needed #18564
Merged

Commit 63043623 by JohannesGaessler: CUDA: only allocate FA tmp buffer if needed
github-actions added labels: Nvidia GPU, ggml
am17an approved these changes on 2026-01-03
JohannesGaessler merged 0f2e42ca into master 54 days ago
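
The PR title suggests that, before this change, the FlashAttention temporary buffer was allocated even when the kernel did not use it. Below is a minimal sketch of that idea, assuming the scratch space is only needed when multiple parallel blocks write partial results that a later combine pass merges; all names here (`FaTmpBuffer`, `maybe_alloc_fa_tmp`, `parallel_blocks`) are hypothetical and are not the actual ggml-cuda symbols.

```cpp
#include <cuda_runtime.h>
#include <cstddef>

struct FaTmpBuffer {
    void * ptr  = nullptr;
    size_t size = 0;
};

// Allocate the FA temp buffer only when >1 parallel blocks produce partial
// results that must be combined afterwards; with a single block the kernel
// can write its output directly and no scratch space is required.
static cudaError_t maybe_alloc_fa_tmp(FaTmpBuffer & buf, int parallel_blocks, size_t bytes_per_block) {
    if (parallel_blocks <= 1) {
        return cudaSuccess; // no combine pass, nothing to allocate
    }
    const size_t needed = (size_t) parallel_blocks * bytes_per_block;
    if (buf.ptr != nullptr && buf.size >= needed) {
        return cudaSuccess; // reuse the existing allocation
    }
    if (buf.ptr != nullptr) {
        cudaFree(buf.ptr);
        buf.ptr  = nullptr;
        buf.size = 0;
    }
    const cudaError_t err = cudaMalloc(&buf.ptr, needed);
    if (err == cudaSuccess) {
        buf.size = needed;
    }
    return err;
}
```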
