Check memory overlap in sort for large input sizes (#58327)
Summary:
The downstream cub sort doesn't support in-place sorting; this PR adds a memory-overlap check that falls back to allocating a new tensor instead of silently corrupting the returned indices.
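
For illustration only, the overlap check and fallback described above can be sketched in plain Python. This is not the code in the PR: `ranges_overlap` and `sort_out_of_place` are hypothetical names, and a Python list stands in for device memory.

```python
# Illustrative sketch only; names are hypothetical, not PyTorch internals.

def ranges_overlap(a_start, a_len, b_start, b_len):
    """Return True if the half-open ranges [a_start, a_start + a_len) and
    [b_start, b_start + b_len) share at least one element."""
    return a_start < b_start + b_len and b_start < a_start + a_len


def sort_out_of_place(storage, in_start, out_start, n):
    """Sort storage[in_start:in_start + n] for a backend that cannot sort in place.

    If the requested output region overlaps the input region, the result is
    returned in a freshly allocated list and `storage` is left untouched.
    Otherwise the result is written into storage[out_start:out_start + n].
    Returns (sorted_values, allocated_new).
    """
    src = storage[in_start:in_start + n]  # snapshot of the input region
    if ranges_overlap(in_start, n, out_start, n):
        # Assumed fallback: allocate a new buffer rather than write over the input.
        return sorted(src), True
    storage[out_start:out_start + n] = sorted(src)
    return storage[out_start:out_start + n], False
```

In this sketch the caller learns from the second return value whether a new buffer was allocated; how the real change reports or hides that is not stated in this message.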
CC ngimel zasdfgbnm
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58327
Reviewed By: mruberry
Differential Revision: D28661244
Pulled By: ngimel
fbshipit-source-id: 40617a7d3bfcebbe187bb706b6b753371bb99097