llama.cpp
baad9488 - ggml : Q2k interleaving implementation - x86/x64 SIMD (#14373)

Commit
70 days ago
ggml : Q2k interleaving implementation - x86/x64 SIMD (#14373)

* Initial Q2_K Block Interleaving Implementation
* Addressed review comments and clean up of the code
* Post rebase fixes
* Initial CI/CD fixes
* Update declarations in arch-fallback.h
* Changes for GEMV Q2_K in arch-fallback.h
* Enable repacking only on AVX-512 machines
* Update comments in repack.cpp
* Address q2k comments

---------

Co-authored-by: Manogna-Sree <elisetti.manognasree@multicorewareinc.com>