[ROCm] Add compute type for SkipLayerNorm to fix ROCm CI (#15192)
- Add a compute type to SkipLayerNorm to fix the ROCm CI and get more
accurate results.
  SkipLayerNorm:
    type T: input, skip, bias
    type U: epsilon, compute result
    type V: output, beta, gamma
- Refactor the usage of aligned_vector and reduce the use of
`reinterpret_cast`.
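
To illustrate the T/U/V split listed above, here is a minimal CPU sketch. It is not the ROCm kernel from this change: the function name `SkipLayerNormRow`, its parameter order, and the single-row layout are assumptions made for illustration. Every intermediate value is held in the compute type U, and the result is narrowed to V only on the final store.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical reference for one row; names and signature are illustrative only.
// T: input, skip, bias
// U: epsilon and all intermediate compute results
// V: output, beta, gamma
template <typename T, typename U, typename V>
void SkipLayerNormRow(const T* input, const T* skip, const T* bias,
                      const V* gamma, const V* beta, U epsilon,
                      std::size_t hidden_size, V* output) {
  assert(hidden_size > 0);
  std::vector<U> sum(hidden_size);

  // Promote each operand to U before adding, so the sum is not rounded in T.
  U mean = U(0);
  for (std::size_t i = 0; i < hidden_size; ++i) {
    U v = static_cast<U>(input[i]) + static_cast<U>(skip[i]);
    if (bias != nullptr) v += static_cast<U>(bias[i]);
    sum[i] = v;
    mean += v;
  }
  mean /= static_cast<U>(hidden_size);

  // Variance is accumulated in U as well.
  U variance = U(0);
  for (std::size_t i = 0; i < hidden_size; ++i) {
    const U d = sum[i] - mean;
    variance += d * d;
  }
  variance /= static_cast<U>(hidden_size);

  const U inv_std = U(1) / std::sqrt(variance + epsilon);
  for (std::size_t i = 0; i < hidden_size; ++i) {
    U scaled = (sum[i] - mean) * inv_std * static_cast<U>(gamma[i]);
    if (beta != nullptr) scaled += static_cast<U>(beta[i]);
    // Narrow to the output type only at the final store.
    output[i] = static_cast<V>(scaled);
  }
}
```

For example, `SkipLayerNormRow<float, double, float>(...)` keeps float inputs and outputs while accumulating in double. On a GPU the analogous choice would typically be half-precision storage with a float compute type, though the exact instantiations used by the kernel are not stated here.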