llama.cpp
2b3a665d
- llama : use Q4_K for attn_v for Q2_K_S when n_gqa >= 4 (#4996)
Commit
1 year ago
llama : use Q4_K for attn_v for Q2_K_S when n_gqa >= 4 (#4996)

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
References
#4996 - Use Q4_K for attn_v for Q2_K_S when n_gqa >= 4
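
The rule named in the commit title can be sketched as below. This is a minimal illustration, not the code from the commit: the enums and the functions `n_gqa` and `pick_attn_v_type` are hypothetical stand-ins, n_gqa is assumed to be the number of query heads divided by the number of key/value heads, and the `fallback` argument stands for whatever type attn_v would otherwise get, which the commit title does not state.

```cpp
#include <cstdint>

// Hypothetical stand-ins for illustration only; not the real ggml/llama.cpp enums.
enum class QuantType { Q2_K, Q3_K, Q4_K };
enum class FileType  { Q2_K_S, Other };

// Grouped-query attention factor (assumed definition): query heads per
// key/value head. Assumes n_head_kv > 0 and that it divides n_head.
constexpr std::uint32_t n_gqa(std::uint32_t n_head, std::uint32_t n_head_kv) {
    return n_head / n_head_kv;
}

// Pick the quantization type for the attn_v tensor.
// `fallback` is the type that would be used if the rule does not apply;
// the commit title does not say what that type is, so the caller supplies it.
constexpr QuantType pick_attn_v_type(FileType ftype, std::uint32_t gqa, QuantType fallback) {
    if (ftype == FileType::Q2_K_S && gqa >= 4) {
        return QuantType::Q4_K;  // the rule from the commit title
    }
    return fallback;
}
```

Under these assumptions, a model with 64 query heads and 8 key/value heads has n_gqa = 8, so a Q2_K_S quantization would store attn_v as Q4_K. With 32 query heads and 32 key/value heads, n_gqa = 1 and the rule does not apply.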
Author
ikawrakow
Parents
75632936