hardsigmoid: add CUDA kernels (#36351)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36351
Adds CUDA kernels for hardsigmoid, to enable its use in training.
Note: the update to the CPU backward pass only keeps the CPU and CUDA
logic consistent; it does not change functionality.
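
For reference, a minimal pure-Python sketch of the math the forward and
backward kernels are expected to compute. It assumes the usual definition
hardsigmoid(x) = relu6(x + 3) / 6 and a zero gradient at the boundaries
x = -3 and x = 3. The function names below are illustrative and are not
the kernel names used in this PR.

```python
def hardsigmoid(x: float) -> float:
    """Forward: clamp (x + 3) to [0, 6], then scale by 1/6."""
    return min(max(x + 3.0, 0.0), 6.0) / 6.0


def hardsigmoid_backward(grad_output: float, x: float) -> float:
    """Backward: slope is 1/6 strictly inside (-3, 3), else 0.

    Treating the boundary points as zero-gradient is an assumption of
    this sketch.
    """
    if -3.0 < x < 3.0:
        return grad_output / 6.0
    return 0.0


def numeric_grad(x: float, eps: float = 1e-6) -> float:
    """Central finite difference of the forward pass, for a sanity check."""
    return (hardsigmoid(x + eps) - hardsigmoid(x - eps)) / (2.0 * eps)


if __name__ == "__main__":
    # Compare the analytic gradient with the finite-difference estimate
    # at points away from the kinks at -3 and 3.
    for x in (-5.0, -1.0, 0.0, 1.0, 5.0):
        analytic = hardsigmoid_backward(1.0, x)
        assert abs(analytic - numeric_grad(x)) < 1e-6
        print(x, hardsigmoid(x), analytic)
```

Comparing an analytic gradient against a finite-difference estimate like
this is one way to check that a CPU and a CUDA backward pass agree.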
Test Plan:
Forward pass: add CI coverage.
Backward pass: run this gist:
https://gist.github.com/vkuzo/95957d365600f9ad10d25bd20f58cc1a
Imported from OSS
Differential Revision: D20955589
fbshipit-source-id: dc198aa6a58e1a7996e1831f1e479c398ffcbc90