Fix incorrect CUDA `torch.nn.Embedding` result when `max_norm` is not None and indices are not sorted (#45248)
Summary:
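`thrust::unique` removes only consecutive duplicates, so with unsorted indices the same embedding row could remain in the "unique" list more than once when `max_norm` renormalization is applied. Sorting indices before calling `thrust::unique` fixes the issue.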
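
A minimal pure-Python sketch of why the sort matters, not the actual CUDA code from this PR. The helper `unique_consecutive` and the sample `indices` list are hypothetical; the helper only mimics the "drop adjacent repeats" contract that `thrust::unique` shares with `std::unique`.

```python
from itertools import groupby


def unique_consecutive(values):
    """Drop only adjacent repeats (hypothetical stand-in for thrust::unique)."""
    return [key for key, _ in groupby(values)]


# Hypothetical unsorted embedding indices with non-adjacent duplicates.
indices = [2, 0, 2, 1, 0]

# Without sorting, no two equal values are adjacent, so nothing is removed.
without_sort = unique_consecutive(indices)

# After sorting, equal values are adjacent and collapse to one entry each.
with_sort = unique_consecutive(sorted(indices))

print(without_sort)  # [2, 0, 2, 1, 0]
print(with_sort)     # [0, 1, 2]
```

In the unsorted case rows 0 and 2 each appear twice, so any per-row step driven by that list would visit them more than once.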
Fixes https://github.com/pytorch/pytorch/issues/44792
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45248
Reviewed By: mruberry
Differential Revision: D24194696
Pulled By: ngimel
fbshipit-source-id: ab59ef9d46b9917b1417bab25f80ce9780f0c930