llama.cpp
CUDA: generalize FP16 fattn vec kernel
#7061
Merged
JohannesGaessler merged 7 commits into ggml-org:master from JohannesGaessler:cuda-fa-no-tc-5
JohannesGaessler force-pushed to 57bde8c2 (1 year ago)
Commits:
- CUDA: generalize FP16 fattn vec kernel (48463c0b)
- disable unsupported head sizes for AMD in test (86636bd1)
- try AMD fix (617f129e)
- fix batch size 2-8 (d9bcb92f)
- partially revert changes (fa81c3a2)
mofosyne added the enhancement label
- fix performance regression (22727651)
- fix compiler warning (fece1fe4)
JohannesGaessler force-pushed from 78ee06e5 to fece1fe4 (1 year ago)
mofosyne added the Review Complexity : High label
slaren approved these changes on 2024-05-09
JohannesGaessler merged a743d76a into master (1 year ago)
Reviewers: slaren
Assignees: No one assigned
Labels: enhancement, Review Complexity : High
Milestone: No milestone