Update reduce norm1/norm2 and layernorm kernels for ROCm 4.3.1 (#9399)
* update layernorm to reflect the fix in ROCm 4.3.1
* fix unit test (UT)
Co-authored-by: Weixing Zhang <wezhan@microsoft.com>
Co-authored-by: Ethan Tao <ettao@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>