[GPU] Fix the broken strides value for 2d transpose (#50310)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50310
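Swapping the stride values is fine as long as the output tensor's storage stays non-contiguous. However, when we copy the result back to the CPU, we expect a contiguous tensor, whose strides are recomputed from the transposed shape rather than swapped (compare `y` and `z` below).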
```
>>> x = torch.rand(2,3)
>>> x.stride()
(3, 1)
>>> y = x.t()
>>> y.stride()
(1, 3)
>>> z = y.contiguous()
>>> z.stride()
(2, 1)
```
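The same stride arithmetic can be sketched without torch. This is only an illustration of the row-major rule, not the code changed in this PR; `contiguous_strides` and `transpose_2d` are hypothetical helper names introduced here.

```python
def contiguous_strides(shape):
    """Row-major strides (in elements) for a densely packed tensor of `shape`."""
    strides = []
    step = 1
    # Walk dimensions from innermost to outermost; each stride is the
    # product of all dimension sizes to its right.
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return tuple(reversed(strides))


def transpose_2d(shape, strides):
    """Transpose as a view: swap shape and strides, leave storage untouched."""
    return (shape[1], shape[0]), (strides[1], strides[0])


x_shape = (2, 3)
x_strides = contiguous_strides(x_shape)                  # matches x.stride()

y_shape, y_strides = transpose_2d(x_shape, x_strides)    # matches y.stride()

# A contiguous copy of the transposed tensor gets fresh row-major strides
# computed from the transposed shape, not the swapped ones.
z_strides = contiguous_strides(y_shape)                  # matches z.stride()

print(x_strides, y_strides, z_strides)
```

Swapped strides (`y_strides`) are only valid while the data stays in its original layout; once the data is repacked, the strides have to come from the new shape.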
ghstack-source-id: 119692581
Test Plan: Sandcastle CI
Reviewed By: AshkanAliabadi
Differential Revision: D25823665
fbshipit-source-id: 61667c03d1d4dd8692b76444676cc393f808cec8