llama.cpp
ggml: implement quantized KV cache for FA
#7372
Merged

JohannesGaessler
slaren
github-actions
JohannesGaessler force pushed 2 years ago
JohannesGaessler
ggerganov
ggerganov approved these changes on 2024-05-19
JohannesGaessler ggml: implement quantized KV cache for FA
b7da2e86
JohannesGaessler force pushed to b7da2e86 2 years ago
JohannesGaessler merged 5ca49cbe into master 2 years ago
