convert singrad to function op and remove cpu kernel (#13263)
### Description
Implemented gradient of sin as a function op.
### Motivation and Context
Sin gradient currently implemented as cpu op which could hurt
performance.
### Testing
built ORT from source: `./build.sh --config RelWithDebInfo
--enable_training --use_cuda --cuda_home /usr/local/cuda --cudnn_home
/usr/local/cuda --build_wheel --parallel --skip_tests`
tested SinGrad implementation: `cd build/Linux/RelWithDebInfo/ &&
./onnxruntime_test_all --gtest_filter=GradientCheckerTest.SinGrad`
Co-authored-by: Prathik Rao <prathikrao@microsoft.com>
Co-authored-by: Baiju Meswani <bmeswani@microsoft.com>