llama.cpp
CUDA: larger SRAM reads for tile FA, AMD FP16 dot
#15927
Merged
JohannesGaessler merged 2 commits into ggml-org:master from JohannesGaessler:cuda-fa-tile-mem-pattern-4
Commit 8821183a — CUDA: larger SRAM reads for tile FA, AMD FP16 dot
github-actions added the Nvidia GPU and ggml labels
Commit fe4eb4f8 — fix logic for availability of v_dot2_f32_f16
JohannesGaessler force-pushed from 4ff67318 to fe4eb4f8 8 days ago
slaren approved these changes on 2025-09-11
JohannesGaessler merged 0e6ff004 into master 8 days ago
Reviewers: slaren
Assignees: no one assigned
Labels: Nvidia GPU, ggml
Milestone: no milestone