llama.cpp
CUDA: generalize FP16 fattn vec kernel
#7061
Merged


JohannesGaessler
sorasoras
JohannesGaessler
github-actions
8XXD8
JohannesGaessler
8XXD8
JohannesGaessler
sorasoras
vonjackustc
sorasoras
JohannesGaessler
jdecourval
sorasoras
JohannesGaessler force pushed to 57bde8c2 1 year ago
JohannesGaessler
sorasoras
JohannesGaessler
slaren
JohannesGaessler
slaren
ggerganov
JohannesGaessler
jdecourval
JohannesGaessler CUDA: generalize FP16 fattn vec kernel
48463c0b
JohannesGaessler disable unsupported head sizes for AMD in test
86636bd1
JohannesGaessler try AMD fix
617f129e
JohannesGaessler fix batch size 2-8
d9bcb92f
JohannesGaessler partially revert changes
fa81c3a2
JohannesGaessler
mofosyne added enhancement
JohannesGaessler fix performance regression
22727651
JohannesGaessler fix compiler warning
fece1fe4
JohannesGaessler force pushed from 78ee06e5 to fece1fe4 1 year ago
JohannesGaessler
mofosyne added Review Complexity : High
slaren
slaren
slaren approved these changes on 2024-05-09
JohannesGaessler merged a743d76a into master 1 year ago

