llama.cpp
Use Q4_K for attn_v for Q2_K_S when n_gqa >= 4
#4996
Merged

ggerganov merged 1 commit into master from ik/better_q2_k_s
ikawrakow
Use Q4_K for attn_v for Q2_K_S when n_gqa >= 4
9fd1e83f
ggerganov approved these changes on 2024-01-17
ggerganov merged 2b3a665d into master 1 year ago
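
The rule in the PR title can be sketched as a small type-selection function. This is only an illustration under stated assumptions, not the code from the commit: `QuantType`, `HParams`, and `pick_type_q2_k_s` are hypothetical names, the fallback type is passed in by the caller rather than taken from the real Q2_K_S mix, and `n_gqa` is assumed to be the number of query heads per key/value head. With grouped-query attention the `attn_v` tensor is smaller by that factor, so giving it a higher-precision type should add comparatively little to the file size.

```cpp
#include <cstdint>
#include <string>

// Hypothetical stand-ins for the quantization types named in the PR title.
enum class QuantType { Q2_K, Q4_K };

// Hypothetical subset of model hyperparameters.
struct HParams {
    uint32_t n_head;     // number of attention (query) heads
    uint32_t n_head_kv;  // number of key/value heads

    // Grouped-query attention factor: query heads per key/value head.
    // The zero check is only a guard for this sketch.
    uint32_t n_gqa() const { return n_head_kv == 0 ? 1 : n_head / n_head_kv; }
};

// Pick the type for one tensor in a Q2_K_S-style mix.
// base_type is whatever the mix would otherwise use (an assumption here,
// supplied by the caller).
QuantType pick_type_q2_k_s(const std::string & tensor_name,
                           const HParams & hp,
                           QuantType base_type) {
    const bool is_attn_v = tensor_name.find("attn_v") != std::string::npos;
    if (is_attn_v && hp.n_gqa() >= 4) {
        return QuantType::Q4_K;  // bump attn_v when GQA makes it small
    }
    return base_type;            // everything else keeps its base type
}
```

For a model with 64 query heads and 8 key/value heads (`n_gqa` of 8), a tensor named `blk.0.attn_v.weight` would be assigned `Q4_K` in this sketch, while a model without grouped-query attention (`n_gqa` of 1) keeps the base type.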
