Fix off-by-one in CPU randperm range check (fixes #56822) (#56967)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/56822
CPU randperm had an off-by-one error when checking the limits of the requested range. The bug also shows up in the "CUDA" version, which falls back to the CPU implementation for small input sizes.
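A minimal sketch of how an off-by-one in this kind of limit check can arise. It is illustrative only and is not the code changed by this PR. It assumes the check compares n against the largest value the output dtype can hold; the helper names and the int8 limit are hypothetical.

```python
# Hypothetical illustration; these helpers do not exist in PyTorch.
INT8_MAX = 127  # largest value representable in a signed 8-bit integer


def fits_strict(n, dtype_max):
    """Compares n itself against the limit, so n == dtype_max + 1 is rejected."""
    return n <= dtype_max


def fits_exact(n, dtype_max):
    """A permutation of n elements holds 0..n-1, so only n - 1 must fit."""
    return n - 1 <= dtype_max


# The two checks disagree only at the boundary n == dtype_max + 1.
boundary = INT8_MAX + 1
disagree = fits_strict(boundary, INT8_MAX) != fits_exact(boundary, INT8_MAX)
```

The checks agree everywhere except at the single boundary value, which is why this kind of bug tends to surface only for specific input sizes.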
CC zasdfgbnm
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56967
Reviewed By: mruberry
Differential Revision: D28031819
Pulled By: ngimel
fbshipit-source-id: 4d25995628997f164aafe94e7eae6c54f018e4e5