Index expanded dims before checking memory overlap (#98656)
As the comment for `get_expanded_dims` says:
```
# copy_ fails when trying to write to tensors with memory overlap,
# for expanded dimensions (a dimension which used to have size 1 -> ?)
# we can select one element from that dimension and write to it
# to achieve writing to all values of that dimension of the input tensor
```
We were doing this for the copy, but not for the check that decides whether we can copy. Update it so that we index the expanded dims first and then check for memory overlap. This covers all of the `complex_striding` warnings I observed in TB.
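
To illustrate the ordering issue, here is a minimal pure-Python sketch that models a tensor only by its sizes and strides. The helper names (`expanded_dims`, `index_expanded`, `may_overlap`) and the overlap test are simplified assumptions made for illustration; they are not the code changed in this PR.

```python
from typing import List, Sequence, Tuple


def expanded_dims(sizes: Sequence[int], strides: Sequence[int]) -> List[int]:
    """Dims that look expanded: stride 0 with more than one element."""
    return [i for i, (n, s) in enumerate(zip(sizes, strides)) if s == 0 and n != 1]


def index_expanded(
    sizes: Sequence[int], strides: Sequence[int]
) -> Tuple[Tuple[int, ...], Tuple[int, ...]]:
    """Select a single element along every expanded dim (size becomes 1)."""
    dims = set(expanded_dims(sizes, strides))
    new_sizes = tuple(1 if i in dims else n for i, n in enumerate(sizes))
    return new_sizes, tuple(strides)


def may_overlap(sizes: Sequence[int], strides: Sequence[int]) -> bool:
    """Conservative check (an assumption, not PyTorch's): walk dims from the
    smallest stride up and report overlap if a stride does not step past
    the largest offset reachable through the dims already visited."""
    dims = sorted((abs(s), n) for n, s in zip(sizes, strides) if n > 1)
    max_offset = 0
    for stride, size in dims:
        if stride <= max_offset:
            return True
        max_offset += (size - 1) * stride
    return False


# A (1, 3) tensor expanded to (4, 3): dim 0 has stride 0.
sizes, strides = (4, 3), (0, 1)

# Checking first rejects the tensor, even though the copy itself would
# only write through one element of the expanded dim.
check_then_index = not may_overlap(sizes, strides)

# Indexing first, then checking, accepts it.
index_then_check = not may_overlap(*index_expanded(sizes, strides))
```

In this sketch `check_then_index` is `False` and `index_then_check` is `True`, which mirrors the gap described above: the overlap check has to see the same indexed view that the copy writes to.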
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98656
Approved by: https://github.com/ngimel, https://github.com/yf225