llama.cpp
9b265118 - ggml-cpu: implement MXFP4 SIMD for s390x (#16193)

Commit
78 days ago
ggml-cpu: implement MXFP4 SIMD for s390x (#16193)

* ggml-cpu: impl mxfp4 s390x

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: missing s = sumf

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: fix incorrect kval_mxfp4 type

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: rework mxfp4

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: missing delta calc

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: fix typo

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: fix typo for vec_splats

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: expand to 2 blocks per loop

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: add unroll to boost perf

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: back to 1 block per loop to test perf

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* Revert "ggml-cpu: back to 1 block per loop to test perf"

This reverts commit 1fe55724e2dc295701101bf838bdd4a512237492.

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: rm unroll from single block

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

---------

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
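
The commit message names several steps (the final `s = sumf` store, the per-block delta calculation, and processing 2 blocks per loop) without showing code. The following is a minimal scalar sketch of what an MXFP4 dot product against 8-bit quantized activations could look like. It is not the s390x vector code from this commit. The type and function names (`blk_mxfp4`, `blk_q8`, `mxfp4_dot_ref`, `mxfp4_dot_2blocks`), the float-typed activation scale, and the nibble ordering are assumptions made for illustration. The sketch also ignores the E8M0 NaN encoding (`e == 255`).

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>

#define QK 32 /* elements per block; MXFP4 shares one scale across 32 values */

/* Hypothetical block layouts, for illustration only. */
typedef struct {
    uint8_t e;          /* shared E8M0 exponent, decodes to 2^(e - 127) */
    uint8_t qs[QK / 2]; /* 32 four-bit E2M1 codes, two per byte */
} blk_mxfp4;

typedef struct {
    float  d;      /* per-block scale (assumed float here for simplicity) */
    int8_t qs[QK]; /* 8-bit quantized activations */
} blk_q8;

/* E2M1 magnitudes {0, 0.5, 1, 1.5, 2, 3, 4, 6}, doubled so they fit in int8.
 * Codes 8..15 are the negated values. */
static const int8_t kvals_x2[16] = {
    0,  1,  2,  3,  4,  6,  8,  12,
    0, -1, -2, -3, -4, -6, -8, -12,
};

/* Half of the E8M0 scale, 2^(e - 128), compensating for the doubled table. */
static float e8m0_half(uint8_t e) {
    return ldexpf(1.0f, (int)e - 128);
}

/* Dot product of one block pair: integer accumulation, then one "delta" scale.
 * Assumed ordering: low nibbles hold elements 0..15, high nibbles 16..31. */
static float block_dot(const blk_mxfp4 *x, const blk_q8 *y) {
    int sumi = 0;
    for (int j = 0; j < QK / 2; j++) {
        sumi += y->qs[j]          * kvals_x2[x->qs[j] & 0x0F];
        sumi += y->qs[j + QK / 2] * kvals_x2[x->qs[j] >> 4];
    }
    const float delta = y->d * e8m0_half(x->e);
    return delta * (float)sumi;
}

/* Reference loop: one block per iteration. */
static float mxfp4_dot_ref(int nb, const blk_mxfp4 *x, const blk_q8 *y) {
    assert(nb >= 0);
    float sumf = 0.0f;
    for (int ib = 0; ib < nb; ib++) {
        sumf += block_dot(&x[ib], &y[ib]);
    }
    return sumf;
}

/* Same computation with two blocks per iteration plus a single-block tail,
 * mirroring the "2 blocks per loop" structure named in the commit message. */
static float mxfp4_dot_2blocks(int nb, const blk_mxfp4 *x, const blk_q8 *y) {
    assert(nb >= 0);
    float sumf = 0.0f;
    int ib = 0;
    for (; ib + 1 < nb; ib += 2) {
        sumf += block_dot(&x[ib], &y[ib]) + block_dot(&x[ib + 1], &y[ib + 1]);
    }
    for (; ib < nb; ib++) {
        sumf += block_dot(&x[ib], &y[ib]);
    }
    return sumf;
}
```

In a SIMD version the inner table lookup and multiply-accumulate would typically be replaced by vector permute and multiply-sum operations. The two-block loop mainly gives the compiler and CPU more independent work per iteration, while the tail loop handles an odd block count.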