Do not use double for single-prec upsample (#88277)
I'm not sure what the best behaviour would be here, but it feels a bit strange to perform parts of a `float32` computation in `float64` and then downcast the results back to `float32`.
Use `at::opmath_type` rather than `at::acc_type`, as no accumulation is performed in this op.
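For illustration, a minimal C++ sketch (not the actual kernel code) of what the type switch amounts to, using a hypothetical 1-D linear-interpolation helper in place of the real upsample math; `at::acc_type` and `at::opmath_type` are the real ATen templates, everything else here is made up:

```cpp
// Sketch only: shows which scalar type the interpolation arithmetic runs in
// for a float32 input, before vs. after the change.
#include <ATen/AccumulateType.h>
#include <ATen/OpMathType.h>

template <typename scalar_t>
scalar_t interp_before(scalar_t a, scalar_t b, scalar_t w) {
  // Before: at::acc_type<float, /*is_cuda=*/false> is double, so float32
  // inputs are promoted to float64 for the math and cast back down.
  using acc_t = at::acc_type<scalar_t, /*is_cuda=*/false>;
  acc_t result = (acc_t(1) - static_cast<acc_t>(w)) * static_cast<acc_t>(a) +
                 static_cast<acc_t>(w) * static_cast<acc_t>(b);
  return static_cast<scalar_t>(result);
}

template <typename scalar_t>
scalar_t interp_after(scalar_t a, scalar_t b, scalar_t w) {
  // After: at::opmath_type<float> is float, so float32 math stays in single
  // precision (reduced-precision types such as at::Half still widen to float).
  using opmath_t = at::opmath_type<scalar_t>;
  opmath_t result = (opmath_t(1) - static_cast<opmath_t>(w)) * static_cast<opmath_t>(a) +
                    static_cast<opmath_t>(w) * static_cast<opmath_t>(b);
  // e.g. interp_after<float>(0.f, 1.f, 0.25f) == 0.25f
  return static_cast<scalar_t>(result);
}
```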
I don't know much about double- vs single-precision scalar perf on x86 CPUs, but before the change:
```
$ python -c "import timeit;import torch;x=torch.arange(100, dtype=torch.float32).reshape(1, 1, 10, 10); print(timeit.Timer(stmt='torch.nn.functional.interpolate(x, scale_factor=2.0, mode=\"bilinear\", align_corners=False)', globals={'x':x, 'torch':torch}).timeit())"
11.337517574429512
```
After the change:
```
$ python -c "import timeit;import torch;x=torch.arange(100, dtype=torch.float32).reshape(1, 1, 10, 10); print(timeit.Timer(stmt='torch.nn.functional.interpolate(x, scale_factor=2.0, mode=\"bilinear\", align_corners=False)', globals={'x':x, 'torch':torch}).timeit())"
10.513805857859552
```
I.e. roughly a 7% perf improvement from avoiding the `double` math (measured on Intel(R) Xeon(R) Platinum 8275CL CPU @ 3.00GHz).
NOTE (see the check below):
- `at::acc_type<float, false>` yields `double`
- `at::acc_type<float, true>` yields `float`
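The mappings above can be verified with a few `static_assert`s; a standalone sketch, assuming the ATen headers are on the include path:

```cpp
#include <type_traits>
#include <ATen/AccumulateType.h>
#include <ATen/OpMathType.h>

static_assert(std::is_same_v<at::acc_type<float, /*is_cuda=*/false>, double>,
              "CPU accumulate type for float widens to double");
static_assert(std::is_same_v<at::acc_type<float, /*is_cuda=*/true>, float>,
              "CUDA accumulate type for float stays float");
static_assert(std::is_same_v<at::opmath_type<float>, float>,
              "opmath type for float stays float");
```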
Fixes https://github.com/pytorch/pytorch/issues/87968
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88277
Approved by: https://github.com/mingfeima, https://github.com/ngimel, https://github.com/jgong5