dbr quant overhead[10/x]: disable torch_function overrides for leaf nodes (#68836)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68836
If we have a leaf module like a `torch.nn.Conv2d`, DBR quant handles
the input and output of the module and should treat the inside of
this module as invisible. Specifically, there is no need to override
the `F.conv2d` call if the parent module is already being overridden.
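For illustration, a minimal sketch of that module boundary (the hook here is a hypothetical placeholder for DBR's actual input/output handling, not the real internals):
```
import torch
import torch.nn as nn

# A leaf module: DBR quant observes its input and output at the module
# boundary; the torch.nn.functional.conv2d call inside forward() should
# be invisible to the tracing machinery.
conv = nn.Conv2d(3, 8, kernel_size=3)

def observe_io(module, inputs, output):
    # stand-in for DBR's handling of the module's inputs and outputs
    pass

conv.register_forward_hook(observe_io)
out = conv(torch.randn(1, 3, 224, 224))
```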
Before this PR, `__torch_function__` was still overridden for the insides
of leaf modules, and the override was a functional no-op; it still incurred
some overhead on every call because it checked the hook type.
This PR adds a fast global override flag so we can skip overriding the insides
of leaf modules. This also improves performance of the prepared model,
because we now skip overriding all of the inner functions in observers.
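A minimal sketch of the fast-global-flag idea (the flag, helper, and tensor
subclass names are hypothetical, not the actual PR implementation):
```
import torch
from contextlib import contextmanager

# Hypothetical module-level flag, checked first in __torch_function__;
# when False, dispatch falls through with no hook-type bookkeeping.
_override_enabled = True

@contextmanager
def _leaf_module_scope():
    # Disable overrides while executing the insides of a leaf module.
    global _override_enabled
    saved = _override_enabled
    _override_enabled = False
    try:
        yield
    finally:
        _override_enabled = saved

class QuantProxyTensor(torch.Tensor):
    @classmethod
    def __torch_function__(cls, func, types, args=(), kwargs=None):
        kwargs = kwargs or {}
        if not _override_enabled:
            # Fast path: skip all quantization logic inside leaf modules.
            return super().__torch_function__(func, types, args, kwargs)
        # ... quantization hook handling would go here ...
        return super().__torch_function__(func, types, args, kwargs)

# Usage: wrap leaf module execution so every op inside takes the fast path.
x = torch.randn(1, 3, 8, 8).as_subclass(QuantProxyTensor)
with _leaf_module_scope():
    y = x.relu()  # dispatches through the fast path above
```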
Test Plan:
testing
```
python test/test_quantization.py TestQuantizeDBR
```
perf
```
// MobileNetV2, 1x3x224x224, comparing fp32 with dbr quant, macOS laptop
// before
fp32: 0.017837 seconds avg
fx_prepared: 0.021963 seconds avg, 0.812143 speedup vs fp32
fx_quantized: 0.012632 seconds avg, 1.412056 speedup vs fp32
dt_prepared: 0.034052 seconds avg, 0.523820 speedup vs fp32
dt_quantized: 0.018316 seconds avg, 0.973829 speedup vs fp32
// after
fp32: 0.020395 seconds avg
fx_prepared: 0.026969 seconds avg, 0.756230 speedup vs fp32
fx_quantized: 0.013195 seconds avg, 1.545611 speedup vs fp32
dt_prepared: 0.033432 seconds avg, 0.610023 speedup vs fp32
dt_quantized: 0.018244 seconds avg, 1.117866 speedup vs fp32
```
Reviewed By: jerryzh168
Differential Revision: D32630883
Pulled By: vkuzo
fbshipit-source-id: 6365e1c514726d8b2a4b3a51f114f5fed3ebe887