llama.cpp

ggml: implement quantized KV cache for FA #7372 (Merged)
Author: JohannesGaessler
Participants: slaren, github-actions

ggerganov approved these changes on 2024-05-19.

Commit b7da2e86 (JohannesGaessler): ggml: implement quantized KV cache for FA
JohannesGaessler force-pushed the branch to b7da2e86 1 year ago.
JohannesGaessler merged 5ca49cbe into master 1 year ago.
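For context, this change lets the KV cache be stored in quantized formats when FlashAttention is enabled. Below is a minimal sketch, not taken from the PR itself, of how one might request a q8_0-quantized KV cache through the llama.cpp C API as it looked around mid-2024; the field names `type_k`, `type_v`, and `flash_attn`, the model path, and the note that a quantized V cache requires FlashAttention are assumptions based on that era of the API and may have changed since.

```cpp
// Sketch only: requesting a q8_0-quantized KV cache with FlashAttention
// via the llama.cpp C API. Field and function names reflect the API as of
// mid-2024 (an assumption) and may differ in current llama.cpp.
#include "llama.h"

int main() {
    llama_backend_init();

    llama_model_params mparams = llama_model_default_params();
    // "model.gguf" is a placeholder path, not from the PR.
    llama_model * model = llama_load_model_from_file("model.gguf", mparams);

    llama_context_params cparams = llama_context_default_params();
    cparams.flash_attn = true;            // enable the FlashAttention kernels
    cparams.type_k     = GGML_TYPE_Q8_0;  // quantize the K cache
    cparams.type_v     = GGML_TYPE_Q8_0;  // quantize the V cache (assumed to require flash_attn)

    llama_context * ctx = llama_new_context_with_model(model, cparams);

    // ... run inference ...

    llama_free(ctx);
    llama_free_model(model);
    llama_backend_free();
    return 0;
}
```

At the command line, the rough equivalent around that time was `./main -fa -ctk q8_0 -ctv q8_0 ...`; these flag names are likewise an assumption and are not confirmed by this page.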
