[quant][graphmode][fx] Produce torch.cat instead of torch.ops.quantized.cat (#54924)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54924
Previously we produced torch.ops.quantized.cat, which takes the inputs, dequantizes them,
and requantizes them with new qparams. This PR changes that to produce torch.cat directly. torch.cat
assumes all inputs share the same qparams, and it produces a quantized Tensor with
the same qparams as the inputs (the previous PR makes sure all inputs and the output of cat share
the same observer/fakequant instance).
Using torch.cat is expected to be more efficient since it does not introduce extra quant/dequant.
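
To illustrate why shared qparams make the extra quant/dequant unnecessary, here is a minimal
plain-Python sketch of affine quantization. It is not the PyTorch implementation: the helper
names (quantize, dequantize, cat_with_requant, cat_shared_qparams) and the scale/zero_point
values are hypothetical and chosen only for this example.

```python
# Illustrative sketch only: plain-Python affine quantization, not PyTorch code.
# All function names and constants here are hypothetical.

def quantize(values, scale, zero_point, qmin=0, qmax=255):
    """Map floats to clamped integers: q = clamp(round(x / scale) + zero_point)."""
    return [min(qmax, max(qmin, round(v / scale) + zero_point)) for v in values]

def dequantize(qvalues, scale, zero_point):
    """Map integers back to floats: x = (q - zero_point) * scale."""
    return [(q - zero_point) * scale for q in qvalues]

def cat_with_requant(q_inputs, scale, zero_point):
    """Dequantize every input, concatenate the floats, then requantize."""
    floats = []
    for q in q_inputs:
        floats.extend(dequantize(q, scale, zero_point))
    return quantize(floats, scale, zero_point)

def cat_shared_qparams(q_inputs):
    """Concatenate the integer values directly; valid only when all inputs
    and the output share the same scale and zero_point."""
    out = []
    for q in q_inputs:
        out.extend(q)
    return out

SCALE, ZERO_POINT = 0.1, 128  # assumed shared qparams
a = quantize([0.0, 0.5, -1.2], SCALE, ZERO_POINT)
b = quantize([2.0, -0.3], SCALE, ZERO_POINT)

direct = cat_shared_qparams([a, b])
roundtrip = cat_with_requant([a, b], SCALE, ZERO_POINT)
```

When every input and the output use the same scale and zero_point, both paths yield the same
integer values, so the dequantize/requantize round trip is redundant work.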
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_cat
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D27416528
fbshipit-source-id: 896c280abec2903c29d597c655729666583ff0dd