llama.cpp
SOTA 2-bit quants - part 2
#4856
Merged

Commits
  • iq2_xs: basics
    Iwan Kawrakow committed 2 years ago
  • iq2_xs: this should have been in the basics
    Iwan Kawrakow committed 2 years ago
  • iq2_xs: CUDA and scalar CPU works
    Iwan Kawrakow committed 2 years ago
  • iq2_xs: WIP Metal
    Iwan Kawrakow committed 2 years ago
  • iq2_xs: Metal now works
    Iwan Kawrakow committed 2 years ago
  • iq2_xs: working, but dog slow, ARM_NEON dot product
    Iwan Kawrakow committed 2 years ago
  • iq2_xs: better ARM_NEON dot product
    Iwan Kawrakow committed 2 years ago
  • iq2_xs: AVX2 dot product - 19.5 t/s
    Iwan Kawrakow committed 2 years ago
  • iq2_xs: faster AVX2 dot product
    Iwan Kawrakow committed 2 years ago
  • iq2_xs: had forgotten to delete iq2-data.h
    Iwan Kawrakow committed 2 years ago
  • Add llama enum for IQ2_XS
    Iwan Kawrakow committed 2 years ago