llama.cpp
CUDA: don't route RDNA3.5 flash attention to the rocWMMA kernel
#24562

Open

liminfei-amd wants to merge 1 commit into ggml-org:master from liminfei-amd:amd-rocm/24437-rdna35-fattn-route-off-rocwmma

Commit b311e833: CUDA: don't route RDNA3.5 flash attention to the rocWMMA kernel

github-actions added the Nvidia GPU and ggml labels

liminfei-amd marked this pull request as ready for review 2 days ago

liminfei-amd requested a review from IMbackK 2 days ago

liminfei-amd requested a review 2 days ago

Reviewers: IMbackK

Labels: Nvidia GPU, ggml
