Fix scatter CPU kernel when input and src are larger than index (#25839)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/25836
According to the documentation (https://pytorch.org/docs/stable/tensors.html#torch.Tensor.scatter_), `index` must have the smallest size of the tensors involved, so the CPU kernel should iterate over the elements of `index` instead of `tensor`.
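
The intended semantics can be sketched in pure Python as below. This is only an illustration under stated assumptions, not the C++ kernel changed by this PR: `scatter_2d` is a hypothetical helper that handles 2-D nested lists, and it loops over `index` only, so entries of the input and `src` that lie outside the extent of `index` are never touched.

```python
def scatter_2d(self_t, dim, index, src):
    """Hypothetical reference scatter for 2-D nested lists.

    Loops over `index` only, so `self_t` and `src` may both be larger
    than `index`.
    """
    out = [row[:] for row in self_t]  # copy so the input stays unchanged
    for i, index_row in enumerate(index):
        for j, target in enumerate(index_row):
            if dim == 0:
                out[target][j] = src[i][j]  # out[index[i][j]][j] = src[i][j]
            else:
                out[i][target] = src[i][j]  # out[i][index[i][j]] = src[i][j]
    return out


self_t = [[0] * 4 for _ in range(3)]                  # 3 x 4 input
src = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]   # 3 x 4 source
index = [[0, 1], [2, 0]]                              # 2 x 2 index, smaller than both

result = scatter_2d(self_t, 0, index, src)
print(result)  # [[1, 6, 0, 0], [0, 2, 0, 0], [5, 0, 0, 0]]
```

Only the four positions named by the 2 x 2 `index` are written. A kernel that iterated over the 3 x 4 input instead would need index entries that do not exist.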
cc: dlibenzi
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25839
Differential Revision: D17269116
Pulled By: ailzhang
fbshipit-source-id: 0e8569fed6c0d2dd70e4e3ec5d29d8730cd2ae8f