llama.cpp
1.5 bit quantization #5453 (Merged)
Commits
All 15 commits are by Iwan Kawrakow, committed 1 year ago:

iq1_s: WIP basics
iq1_s: CUDA is working
iq1_s: scalar CPU dot product
iq1_s: WIP AVX2 dot product - something is not right
Fix tests
Fix shadow warnings
Fix after merge with latest master
iq1_s: AVX2 finally works
iq1_s: ARM_NEON dot product. Works, but not very fast
iq1_s: better grid
iq1_s: use IQ2_XXS for attn_output
iq1_s: Metal basics
iq1_s: Metal works, but quite slow
iq1_s: Tests
iq1_s: slightly faster dot product
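For readers unfamiliar with sub-2-bit schemes, the rough idea behind the "grid" and "dot product" commits is a shared codebook of ternary patterns plus a per-block scale. The sketch below is a minimal illustration under assumed names and sizes (block_demo, QK, and the 4-wide grid are all hypothetical); the actual IQ1_S block layout and the CUDA/AVX2/NEON/Metal kernels in this PR differ from it.

```c
/*
 * Hypothetical sketch of a scalar dot product over a grid-quantized row.
 * Names, block size, and layout are assumptions for illustration only;
 * they do not match the real IQ1_S format in ggml.
 */
#include <stdint.h>
#include <stddef.h>

#define QK     32          /* weights per block (assumed)        */
#define GROUPS (QK / 4)    /* 4 weights per grid entry (assumed) */

typedef struct {
    float   d;             /* per-block scale                    */
    uint8_t qs[GROUPS];    /* indices into a shared ternary grid */
} block_demo;

/* x: quantized weights, y: float activations, grid: shared codebook whose
 * entries each hold 4 values in {-1, 0, +1}. */
static float demo_vec_dot(const block_demo *x, const float *y, size_t nblocks,
                          const int8_t (*grid)[4]) {
    float sum = 0.0f;
    for (size_t ib = 0; ib < nblocks; ++ib) {
        float blk = 0.0f;
        for (int g = 0; g < GROUPS; ++g) {
            const int8_t *w = grid[x[ib].qs[g]];          /* decode 4 weights */
            const float  *v = y + ib*QK + (size_t)g*4;
            blk += w[0]*v[0] + w[1]*v[1] + w[2]*v[2] + w[3]*v[3];
        }
        sum += x[ib].d * blk;   /* apply the per-block scale once */
    }
    return sum;
}
```

In a scheme like this the inner loop is pure sign-selected adds, with one floating-point scale per block, which is why the optimization commits above concentrate on vectorizing the dot product for each backend.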