[MPS] Fix transposed LSTM output with batch_first (#80597)
When `batch_first=True`, the LSTM output should be transposed back to batch-first layout, `(batch, seq, feature)`, before it is returned.
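
A minimal sketch of the expected behavior, not the test added in this PR. The helper name `lstm_output_shapes` and the sizes used are illustrative assumptions. It runs on CPU by default, so it only shows the documented `batch_first` shapes that the MPS backend is expected to match.

```python
import torch


def lstm_output_shapes(batch=2, seq_len=5, input_size=3, hidden_size=4, device="cpu"):
    """Run a single-layer LSTM with batch_first=True and return the result shapes.

    Hypothetical helper for illustration; the sizes are arbitrary.
    """
    lstm = torch.nn.LSTM(input_size, hidden_size, batch_first=True).to(device)
    # With batch_first=True the input is laid out as (batch, seq, feature).
    x = torch.randn(batch, seq_len, input_size, device=device)
    output, (h_n, c_n) = lstm(x)
    # The output should also be (batch, seq, hidden). The states h_n and c_n
    # stay (num_layers, batch, hidden) regardless of batch_first.
    return tuple(output.shape), tuple(h_n.shape), tuple(c_n.shape)


if __name__ == "__main__":
    print(lstm_output_shapes())
```

On a machine where `torch.backends.mps.is_available()` returns true, calling `lstm_output_shapes(device="mps")` should give the same shapes as the CPU run.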
Fixes #80306
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80597
Approved by: https://github.com/kulinseth