pytorch
9e773ea7 - Use `accscalar_t` for CUDA add/sub with Tensor and Scalar (#60454)

Summary:
Follow-up of https://github.com/pytorch/pytorch/issues/60227, related to https://github.com/pytorch/pytorch/issues/59907 and https://github.com/pytorch/pytorch/issues/58833.

With this pull request, `torch.add` and `torch.sub` use `acc_type` for the `Scalar` operand whenever one of the two arguments is a `Scalar`. This mimics the behavior of [`torch.mul`](https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/native/cuda/BinaryMulDivKernel.cu#L18), `torch._foreach_(add|sub).Scalar`, and `torch._foreach_(add|sub).ScalarList`.

---

**reference**
- `torch.mul` CUDA kernel: https://github.com/pytorch/pytorch/blob/b0c9762e2d1dfcde549344628ad6be063378ef6a/aten/src/ATen/native/cuda/BinaryMulDivKernel.cu#L17-L25
- `torch._foreach_(add|sub).Scalar` casts the scalar: https://github.com/pytorch/pytorch/blob/b0c9762e2d1dfcde549344628ad6be063378ef6a/aten/src/ATen/native/cuda/ForeachBinaryOpScalar.cu#L27
- `torch._foreach_(add|sub).ScalarList` uses `BinaryOpScalarListFunctor` (https://github.com/pytorch/pytorch/blob/b0c9762e2d1dfcde549344628ad6be063378ef6a/aten/src/ATen/native/cuda/ForeachFunctors.cuh#L180-L182), and `multi_tensor_apply` takes `scalar_t` inputs and computes in `opmath_t` (almost equivalent to `accscalar_t`): https://github.com/pytorch/pytorch/blob/b0c9762e2d1dfcde549344628ad6be063378ef6a/aten/src/ATen/native/cuda/MultiTensorApply.cuh#L60-L68. `BinaryOpScalarListFunctor` is used at https://github.com/pytorch/pytorch/blob/b0c9762e2d1dfcde549344628ad6be063378ef6a/aten/src/ATen/native/cuda/ForeachBinaryOpScalarList.cu#L24

cc ngimel ptrblck mcarilli

Pull Request resolved: https://github.com/pytorch/pytorch/pull/60454
Reviewed By: VitalyFedyunin
Differential Revision: D29345035
Pulled By: ngimel
fbshipit-source-id: 5dbafbdfe029a9544ec2e58f17d547928e017a04
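
For illustration, here is a minimal standalone CUDA sketch of the idea behind the change. The kernel and variable names below are made up for this example; the actual change goes through the ATen kernel in `BinaryAddSubKernel.cu` and the `TensorIterator`/`gpu_kernel` machinery. The point it demonstrates: keeping the scalar in the accumulation type (`float` for `half`) lets the sum stay correct even when the scalar itself cannot be represented in `half`.

```cuda
// Standalone sketch (hypothetical kernel names, not the ATen code).
#include <cuda_fp16.h>
#include <cstdio>

__global__ void add_scalar_acc(const __half* in, __half* out, float scalar, int n) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) {
    // Keep the scalar in float (the "accscalar_t" for half); only the result is narrowed.
    out[i] = __float2half(__half2float(in[i]) + scalar);
  }
}

__global__ void add_scalar_narrowed(const __half* in, __half* out, float scalar, int n) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) {
    // Narrow the scalar to half first: a scalar above the half range (e.g. 70000.0f)
    // becomes inf before the add and poisons the sum.
    float narrowed = __half2float(__float2half(scalar));
    out[i] = __float2half(__half2float(in[i]) + narrowed);
  }
}

int main() {
  const int n = 1;
  __half h_in = __float2half(-65000.0f);  // representable in half (rounds to -64992)
  const float scalar = 70000.0f;          // overflows half, but the sum (~5008) does not

  __half *d_in, *d_out;
  cudaMalloc(&d_in, sizeof(__half));
  cudaMalloc(&d_out, sizeof(__half));
  cudaMemcpy(d_in, &h_in, sizeof(__half), cudaMemcpyHostToDevice);

  __half h_out;
  add_scalar_acc<<<1, 1>>>(d_in, d_out, scalar, n);
  cudaMemcpy(&h_out, d_out, sizeof(__half), cudaMemcpyDeviceToHost);
  printf("scalar kept in float:    %f\n", __half2float(h_out));

  add_scalar_narrowed<<<1, 1>>>(d_in, d_out, scalar, n);
  cudaMemcpy(&h_out, d_out, sizeof(__half), cudaMemcpyDeviceToHost);
  printf("scalar narrowed to half: %f\n", __half2float(h_out));

  cudaFree(d_in);
  cudaFree(d_out);
  return 0;
}
```

With the first kernel the sketch recovers roughly 5008; with the second, the narrowed scalar becomes `inf` and so does the result, which is the kind of overflow the casts to `acc_type`/`accscalar_t` are meant to avoid.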
Files changed:
  • aten/src/ATen/native/cuda/BinaryAddSubKernel.cu
  • test/test_binary_ufuncs.py