llama.cpp
SOTA 2-bit quants - part 2
#4856
Merged
Commits
iq2_xs: basics
Iwan Kawrakow
committed
2 years ago
iq2_xs: this should have been in the basics
Iwan Kawrakow
committed
2 years ago
iq2_xs: CUDA and scalar CPU works
Iwan Kawrakow
committed
2 years ago
iq2_xs: WIP Metal
Iwan Kawrakow
committed
2 years ago
iq2_xs: Metal now works
Iwan Kawrakow
committed
2 years ago
iq2_xs: working, but dog slow, ARM_NEON dot product
Iwan Kawrakow
committed
2 years ago
iq2_xs: better ARM_NEON dot product
Iwan Kawrakow
committed
2 years ago
iq2_xs: AVX2 dot product - 19.5 t/s
Iwan Kawrakow
committed
2 years ago
iq2_xs: faster AVX2 dot product
Iwan Kawrakow
committed
2 years ago
iq2_xs: had forgotten to delete iq2-data.h
Iwan Kawrakow
committed
2 years ago
Add llama enum for IQ2_XS
Iwan Kawrakow
committed
2 years ago