Probable fix for out-of-place BinaryOpScalar bad values and/or IMAs on 11.2 (ci-all edition) (#52634)
Summary:
Should close https://github.com/pytorch/pytorch/issues/51992.
ci-all resubmit of https://github.com/pytorch/pytorch/pull/52591. The plot has thickened considerably since then: it turns out every foreach functor has bad `r_args` accesses for certain code paths and instantiations.
Also, I noticed the [`n % kILP == 0`](https://github.com/pytorch/pytorch/blob/2680ff7759d8a441eada383ba7aa0fa42c7d35ed/aten/src/ATen/native/cuda/ForeachFunctors.cuh#L87) condition for vectorization in all functors is way too restrictive: it'll refuse to vectorize anything on any tensor whose overall numel is not a multiple of ILP. That's out of scope though.
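To illustrate why the `n % kILP == 0` check is restrictive, here is a minimal host-side sketch. It is not the code in `ForeachFunctors.cuh`: the `kILP` value of 4 is an assumption, the linked condition may include further checks (those are omitted here), and `can_vectorize_strict`, `Split`, and `split_for_vectorization` are hypothetical names. The split shown is just one possible relaxation, not something this PR implements.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: an ILP width of 4, chosen only for illustration.
constexpr int64_t kILP = 4;

// The size part of the quoted condition: vectorize only when n is an
// exact multiple of kILP. Any other checks in the real functors are omitted.
constexpr bool can_vectorize_strict(int64_t n) {
  return n % kILP == 0;
}

// Hypothetical relaxation: process the largest multiple of kILP with
// vectorized loads and leave the remainder to a scalar tail loop.
struct Split {
  int64_t vectorized;  // elements coverable by kILP-wide accesses
  int64_t tail;        // leftover elements, always < kILP
};

inline Split split_for_vectorization(int64_t n) {
  assert(n >= 0);
  const int64_t tail = n % kILP;
  return Split{n - tail, tail};
}
```

Under these assumptions, a tensor with 1023 elements fails the strict check and gets no vectorization at all, whereas the split would still cover 1020 elements with vectorized accesses and handle the last 3 with scalar ones.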
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52634
Reviewed By: H-Huang
Differential Revision: D26725991
Pulled By: izdeby
fbshipit-source-id: 4bade0ac186bf85527baddc1c44b2c2b8e3c9777