HIP: Refactor mma for RDNA and CDNA (#17990)
* mma.cuh for rdna4
* mma for rdna3
* mmq for rdna4
* mmq for rdna3
* align i-major and j-major
* cdna
* fix cuda error
* add missing mfma tile
* fix j-major wrong ne on CDNA
* fix grammar and empty spaces
---------
Co-authored-by: zhang hui <you@example.com>