ggml : optimize for ppc64le using VSX intrinsics (ggml/784)
* optimize for ppc64le using VSX intrinsics
* 1. code clean up by removing comments about overflow concern.
2. fix typo in suffix of scaling.
* Continue to fix typo in suffix of scaling for QK_K <> 256
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>