llama.cpp
CUDA: fix Pascal FA, deq. KV to FP16 for batch > 8 #7681

Merged
Opened by JohannesGaessler
github-actions added labels: Nvidia GPU, ggml
slaren approved these changes on 2024-06-01
Commit 90021615: CUDA: fix Pascal FA, deq. KV to FP16 for batch > 8
JohannesGaessler force-pushed to 90021615 1 year ago
JohannesGaessler merged 750f60c0 into master 1 year ago
