llama.cpp
8141e730 - ggml-cpu: support K tails in power10 Q8/Q4 MMA matmul (#24753)

Commit
1 day ago
ggml-cpu: support K tails in power10 Q8/Q4 MMA matmul (#24753)

* ggml-cpu: support K tails in Power10 MMA Q8/Q4 matmul

This patch removes the requirement that K be divisible by kc in the
tinyBlas_Q0_PPC tiled matmul path. Process the final K panel using its
actual depth and pass the reduced panel size through packing and kernel
execution. This allows more workloads to use the MMA kernel and reduces
fallback to mnpack.

* Apply suggestion from @taronaeo

Co-authored-by: Aaron Teo <taronaeo@gmail.com>

---------

Co-authored-by: Aaron Teo <taronaeo@gmail.com>
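
The K-tail handling described in the commit message can be illustrated with a minimal sketch. This is not the tinyBlas_Q0_PPC code: it uses plain float instead of Q8/Q4 blocks, scalar loops instead of Power10 MMA instructions, and the names `pack_panel`, `panel_kernel`, and `matmul_k_tiled` are hypothetical. It only shows the general idea of computing the last panel's actual depth and passing that depth through both packing and the kernel, so K does not have to be a multiple of kc.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: copy columns [k0, k0 + depth) of a row-major
// rows x K matrix into a contiguous rows x depth panel.
static void pack_panel(const std::vector<float>& src, std::size_t rows, std::size_t K,
                       std::size_t k0, std::size_t depth, std::vector<float>& panel) {
    panel.resize(rows * depth);
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t k = 0; k < depth; ++k) {
            panel[r * depth + k] = src[r * K + k0 + k];
        }
    }
}

// Hypothetical kernel: accumulate C += a_panel * b_panel^T for one K panel.
// The panel depth is a parameter, so a shorter final panel needs no special case.
static void panel_kernel(const std::vector<float>& a_panel, const std::vector<float>& b_panel,
                         std::size_t M, std::size_t N, std::size_t depth,
                         std::vector<float>& C) {
    for (std::size_t i = 0; i < M; ++i) {
        for (std::size_t j = 0; j < N; ++j) {
            float acc = 0.0f;
            for (std::size_t k = 0; k < depth; ++k) {
                acc += a_panel[i * depth + k] * b_panel[j * depth + k];
            }
            C[i * N + j] += acc;
        }
    }
}

// Computes C (M x N) = A (M x K) * B^T, with B stored as N x K, all row-major.
// K is walked in panels of depth kc; the last panel may be shorter (the K tail).
std::vector<float> matmul_k_tiled(const std::vector<float>& A, const std::vector<float>& B,
                                  std::size_t M, std::size_t N, std::size_t K,
                                  std::size_t kc) {
    assert(kc > 0);
    std::vector<float> C(M * N, 0.0f);
    std::vector<float> a_panel, b_panel;
    for (std::size_t k0 = 0; k0 < K; k0 += kc) {
        // Actual depth of this panel: kc for full panels, K - k0 for the tail.
        const std::size_t depth = std::min(kc, K - k0);
        pack_panel(A, M, K, k0, depth, a_panel);
        pack_panel(B, N, K, k0, depth, b_panel);
        panel_kernel(a_panel, b_panel, M, N, depth, C);
    }
    return C;
}
```

With K = 5 and kc = 2, for example, the loop runs two full panels of depth 2 followed by a tail panel of depth 1, rather than rejecting the shape and falling back to another path.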