make clamp_min/max use minimum/maximum kernels, make clamp* correctly propagate nans (#77306)
Since clamp_min and maximum are the same op, reuse the same kernel (it also correctly propagates nans from both the input and the boundary; clamp* previously propagated them from the input only).
Also fixed codegen to make Tensor? overloads come before Scalar? overloads, cc @alband.
Fixes #67428 and #76795 (scalar overloads for clamp* are still not fixed; that will be done in the next PR).
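
For illustration only, a minimal pure-Python sketch of the scalar semantics described above: clamp_min behaves like a nan-propagating maximum, and clamp_max like a nan-propagating minimum. This is a model of the behaviour, not the ATen kernel. The function names below, including `clamp_min_input_only`, are hypothetical helpers, and the latter is only an assumed stand-in for the earlier "input only" propagation.

```python
import math


def maximum(a: float, b: float) -> float:
    """NaN-propagating maximum: the result is NaN if either operand is NaN."""
    if math.isnan(a) or math.isnan(b):
        return math.nan
    return a if a > b else b


def minimum(a: float, b: float) -> float:
    """NaN-propagating minimum: the result is NaN if either operand is NaN."""
    if math.isnan(a) or math.isnan(b):
        return math.nan
    return a if a < b else b


def clamp_min(x: float, lo: float) -> float:
    # Same op as maximum, so it can share one implementation.
    return maximum(x, lo)


def clamp_max(x: float, hi: float) -> float:
    # Same op as minimum.
    return minimum(x, hi)


def clamp_min_input_only(x: float, lo: float) -> float:
    # Hypothetical model of "propagate from input only" (an assumption,
    # not the actual earlier kernel): a NaN boundary is silently ignored
    # because every comparison against NaN is False.
    if math.isnan(x):
        return math.nan
    return lo if x < lo else x


if __name__ == "__main__":
    print(clamp_min(1.0, math.nan))             # NaN boundary propagates
    print(clamp_min_input_only(1.0, math.nan))  # NaN boundary is dropped
```

In this model, a NaN in either the input or the boundary reaches the output of `clamp_min` and `clamp_max`, whereas the input-only variant returns the input unchanged when the boundary is NaN.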
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77306
Approved by: https://github.com/albanD
Author: Natalia Gimelshein