[PATCH] Migrate min from THC to ATen and remove _min (#38440)
Summary:
Related issue: https://github.com/pytorch/pytorch/issues/36900
Since I feel this PR is already large enough, I didn't migrate max in this PR. Legacy code is not cleaned up either. All these remaining work will be done in later PRs after this is merged.
Benchmark on an extreme case
```python
import torch
print(torch.__version__)
t = torch.randn(100000, 2, device='cuda')
warmup = torch.arange(100000000)
torch.cuda.synchronize()
%timeit t.min(dim=0); torch.cuda.synchronize()
```
Before: 4ms; After: 24.5us.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38440
Differential Revision: D21560691
Pulled By: ngimel