[quant][graphmode] Preserve numerics in debug option for clamp ops (#39219)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39219
We currently don't model clamp ops correctly; this PR fixes that.
The reason is that the quantized clamp op quantizes its scalar arguments inside the op implementation: https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/native/quantized/cpu/kernels/QuantizedOpKernels.cpp#L614-L617
So we'll need to model this explicitly in the IR.
When we see an `aten::dequantize - aten::clamp(%x, %min, %max)` pattern,
we first create a scalar tensor with `aten::scalar_tensor(%scalar, ...)`, quantize that tensor with the same quantization parameters as the input tensor of `aten::clamp`, dequantize it, and finally convert the dequantized tensor back to a scalar with `aten::item`.
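The scalar_tensor → quantize → dequantize → item sequence can be sketched in eager-mode PyTorch (this is an illustration of the numerics, not the actual IR pass; the helper name `fake_quantize_scalar` is hypothetical):

```python
import torch

def fake_quantize_scalar(scalar, scale, zero_point, dtype=torch.quint8):
    # aten::scalar_tensor: wrap the Python scalar in a 0-dim float tensor
    t = torch.scalar_tensor(scalar, dtype=torch.float)
    # quantize with the same qparams as the input tensor of aten::clamp
    q = torch.quantize_per_tensor(t, scale, zero_point, dtype)
    # dequantize, then aten::item to convert back to a Python scalar
    return q.dequantize().item()

# Example: with scale=0.1, zero_point=10, the scalar 0.47 snaps to the
# quantization grid: round(0.47 / 0.1) + 10 = 15 -> (15 - 10) * 0.1 = 0.5
print(fake_quantize_scalar(0.47, 0.1, 10))  # -> 0.5
```

This mirrors what the quantized clamp kernel does internally to its min/max arguments, so the debug (dequantize + float op) path produces the same numerics as the fused quantized op.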
Test Plan: Imported from OSS
Differential Revision: D21831350
fbshipit-source-id: d60731459a0465d64946aabc62065d25d92faefc