Port TH cum{sum,prod}_cuda to ATen (#36458)
Summary:
References: https://github.com/pytorch/pytorch/issues/24521 #24522 https://github.com/pytorch/pytorch/issues/24547 #24548 https://github.com/pytorch/pytorch/issues/24507
Depends on https://github.com/pytorch/pytorch/issues/36308
Changes related to this PR are only in the following files:
aten/src/ATen/Declarations.cwrap
aten/src/ATen/native/cuda/ReduceOpsKernel.cu
aten/src/ATen/native/native_functions.yaml
aten/src/THC/generic/THCTensorMathScan.cu
aten/src/THC/generic/THCTensorMathScan.h
Please review, VitalyFedyunin.
Thanks.
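
For reviewers less familiar with the ops being ported: cumsum and cumprod are inclusive scans, so output element i combines inputs 0 through i. The block below is a minimal pure-Python sketch of those semantics for a 1-D sequence only. It is not the ATen CUDA kernel from this PR, and the helper names `cumsum` and `cumprod` are illustrative stand-ins rather than PyTorch APIs.

```python
from itertools import accumulate
import operator


def cumsum(values):
    """Inclusive running sum: out[i] = values[0] + ... + values[i]."""
    return list(accumulate(values, operator.add))


def cumprod(values):
    """Inclusive running product: out[i] = values[0] * ... * values[i]."""
    return list(accumulate(values, operator.mul))


if __name__ == "__main__":
    print(cumsum([1, 2, 3, 4]))   # [1, 3, 6, 10]
    print(cumprod([1, 2, 3, 4]))  # [1, 2, 6, 24]
```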
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36458
Differential Revision: D21718384
Pulled By: ngimel
fbshipit-source-id: 5af15164050c77be164397abd659a48c9ded2b29