llama.cpp
CUDA: FA support for Deepseek (Ampere or newer)
#13306
Merged

JohannesGaessler
JohannesGaessler CUDA: FA support for Deepseek (Ampere or newer)
d19838e9
github-actions added Nvidia GPU
github-actions added python
github-actions added ggml
CISC
CISC commented on 2025-05-04
Panchovix
slaren
Panchovix
JohannesGaessler force pushed 268 days ago
JohannesGaessler wrap __cvta_generic_to_shared for HIP
187054a7
JohannesGaessler force pushed to 187054a7 268 days ago
JohannesGaessler fix loop unrolling for KV data load
dd054465
JohannesGaessler
Panchovix
jukofyork
jukofyork
JohannesGaessler
slaren
JohannesGaessler
JohannesGaessler
slaren
slaren approved these changes on 2025-05-08
Panchovix
JohannesGaessler do loop unrolling via C++ template
fe2b775a
JohannesGaessler merged 0cf6725e into master 265 days ago
JohannesGaessler
Dampfinchen
JohannesGaessler
LostRuins
JohannesGaessler
LostRuins
LostRuins
JohannesGaessler
LostRuins

