[quant][graphmode][fx] Optimize cat (#54813)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54813
Previously, cat took a list of Tensors with different qparams, dequantized them,
concatenated them, and requantized the result with the output qparams. This added unnecessary
overhead from dequantizing and requantizing the Tensors.
This PR adds an optimization for the cat operator: we make sure the inputs and output of cat
use the same observer/fake_quant, which produces a cat that does not do rescaling.
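
For illustration only, here is a minimal pure-Python sketch of why sharing qparams removes the
rescaling step. It models per-tensor affine quantization on plain lists; the helper names
(quantize, dequantize, cat_with_rescaling, cat_shared_qparams) are hypothetical and are not
the functions touched by this PR.

```python
# Hypothetical pure-Python model of per-tensor affine quantization.
# None of these helpers are PyTorch APIs; they only illustrate the idea.

def quantize(xs, scale, zero_point, qmin=0, qmax=255):
    """Map floats to clamped integers: q = clamp(round(x / scale) + zero_point)."""
    return [min(qmax, max(qmin, round(x / scale) + zero_point)) for x in xs]


def dequantize(qs, scale, zero_point):
    """Map integers back to floats: x = (q - zero_point) * scale."""
    return [(q - zero_point) * scale for q in qs]


def cat_with_rescaling(inputs, out_scale, out_zero_point):
    """Old behavior: dequantize every input, concatenate, then requantize.

    `inputs` is a list of (qvalues, scale, zero_point) tuples, so each
    input may carry its own qparams.
    """
    floats = []
    for qs, scale, zero_point in inputs:
        floats.extend(dequantize(qs, scale, zero_point))
    return quantize(floats, out_scale, out_zero_point)


def cat_shared_qparams(qvalue_lists):
    """Optimized behavior: inputs and output share qparams, so the
    integer values can be concatenated directly with no rescaling."""
    out = []
    for qs in qvalue_lists:
        out.extend(qs)
    return out


# Assumed shared qparams, as if one observer were used for all tensors.
scale, zero_point = 0.5, 0
a = quantize([0.0, 0.5, 1.0], scale, zero_point)
b = quantize([1.5, 2.0], scale, zero_point)

slow = cat_with_rescaling([(a, scale, zero_point), (b, scale, zero_point)], scale, zero_point)
fast = cat_shared_qparams([a, b])
```

In this sketch both paths yield the same integers when the qparams are shared, but the second
one skips the dequantize/quantize round trip entirely.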
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D27408377
fbshipit-source-id: 6a4bdcfd15e57ea1fe0f7e72d1e1288eb3ece4db