ggml-cpu: Integrate fp32=bf16xbf16 SME KleidiAI kernel (#13053)
* ggml-cpu: Integrate fp32=bf16xbf16 SME KleidiAI kernel
Signed-off-by: Dan Johansson <dan.johansson@arm.com>
* * code review fixes
Signed-off-by: Dan Johansson <dan.johansson@arm.com>
* * adds a comment that clarifies barrier usage
Signed-off-by: Dan Johansson <dan.johansson@arm.com>
---------
Signed-off-by: Dan Johansson <dan.johansson@arm.com>
Co-authored-by: Charles Xu <charles.xu@arm.com>