llama.cpp
7f251fdb - ggml-cpu: Optimized x86 and generic cpu q1_0 dot (follow up) (#21636)

Commit
20 days ago
ggml-cpu: Optimized x86 and generic cpu q1_0 dot (follow up) (#21636)

* Implemented optimized q1_0 dot for x86 and generic
* Removed redundant helper definition
* Removed two redundant instructions from AVX q1_0 dot
* Fixed inconsistency with fp16 conversion for generic q1_0 dot and deduplicated generic fallback
* Style cleanup around AVX q1_0 dot
* Replaced explicitly unrolled blocks with inner for loop for q1_0
* Replaced scalar ARM q1_0 impl with new generic one