Fix many type mismatches in the CUDA version of calc_digamma and calc_trigamma (#25791)
Summary:
- There are some missing casts.
- Functions like ::log and ::sin may silently invoke the double version on the host even when called with float arguments, causing an unnecessary float-to-double-to-float round trip. For
example, compiling the following code:
```c++
#include <cmath>
float log_float(float f) {
return ::logf(f);
}
double log_double(double f) {
return ::log(f);
}
float log_float2(float f) {
return ::log(f);
}
float log_float3(float f) {
return std::log(f);
}
```
using `g++ -c -O3` leads to:
```asm
log_float(float):
        jmp     logf
log_double(double):
        jmp     log
log_float2(float):
        subq    $8, %rsp
        cvtss2sd        %xmm0, %xmm0
        call    log
        addq    $8, %rsp
        cvtsd2ss        %xmm0, %xmm0
        ret
log_float3(float):
        jmp     logf
```
Note that log_float2 delegates the call to the double version of log
(surrounded by conversions), while log_float3 correctly delegates to
logf. See https://godbolt.org/z/KsRWwW
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25791
Differential Revision: D17452312
Pulled By: izdeby
fbshipit-source-id: 6276a011a373cd7cb144f9ecd84116aa206e7d1b