metal : refactor + optimize (#15857)
* metal : refactor
ggml-ci
* cont : refactor FA-vec kernel
* cont : print metal library load time
* minor : warn to debug + better kernel names
ggml-ci
* metal : optimize mul_mv q8_0
ggml-ci
* metal : simplify FA pipeline creation functions
ggml-ci
* metal : improve naming consistency
* metal : safer function constants offsets
ggml-ci
* metal : comments
ggml-ci