[Pytorch Edge][tracing-based] build tracer in OSS (#64087)
Summary:
1. Introduce
```
MobileModelRunner.h
MobileModelRunner.cpp
TensorUtils.h
TensorUtils.cpp
```
in external. They are nearly identical to the internal versions, except for the namespace and the dependency on folly. In follow-up PRs, TensorUtils and MobileModelRunner will be unified between external and internal.
2. Introduce
```
tracer.cpp
```
for external. Most of it is the same as the internal one, with some unnecessary dependencies cleaned up. It will be unified between internal and external in the next change.
3. Add an executable to build the tracer. It will be built for desktop only.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64087
ghstack-source-id: 139900300
Test Plan:
Given the model
```
import torch
import torch.nn as nn
import torch.utils.bundled_inputs

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.lin = nn.Linear(10, 1)

    def forward(self, x):
        return self.lin(x)

model = Net()
scripted_module = torch.jit.script(model)
example_dict = {'a' : 1, 'b' : 2}
# Bundle one sample input for forward() so the tracer can run the model.
sample_input = {
    scripted_module.forward : [(torch.zeros(1,10),)],
}
bundled_model = torch.utils.bundled_inputs.bundle_inputs(scripted_module, sample_input)
bundled_model._save_for_lite_interpreter("dummy_model_with_bundled_input.ptl")
```
Run the external tracer:
```
./build/bin/model_tracer --model_input_path "/Users/chenlai/Documents/pytorch/tracing/dummy_model_with_bundled_input.ptl" --build_yaml_path "/Users/chenlai/Documents/pytorch/tracing/tmp.yaml"
```
and compare `tmp.yaml` with the operator list generated by the internal tracer:
```
./fbcode/caffe2/fb/model_tracer/run_model_with_bundled_inputs.sh ~/local/notebooks/prod_models/dummy_model_with_bundled_input.ptl
```
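The comparison of the two operator lists could be scripted roughly as follows. This is only a sketch: it does not assume any particular YAML schema for the tracer output, and instead scans the raw text for qualified operator names of the form `namespace::op`. The helper names (`ops_in_text`, `diff_ops`, `diff_op_files`) are hypothetical and not part of this change.

```python
import re
from pathlib import Path

# Matches qualified operator names such as "aten::addmm_" or "aten::add.Tensor".
# Assumption: operators appear in the file as plain "namespace::name" tokens.
OP_PATTERN = re.compile(r"\b[A-Za-z_]\w*::[A-Za-z_]\w*(?:\.\w+)?")


def ops_in_text(text):
    """Return the set of operator names found anywhere in the text."""
    return set(OP_PATTERN.findall(text))


def diff_ops(external_text, internal_text):
    """Return (only_in_external, only_in_internal) as sorted lists."""
    external = ops_in_text(external_text)
    internal = ops_in_text(internal_text)
    return sorted(external - internal), sorted(internal - external)


def diff_op_files(external_path, internal_path):
    """Same as diff_ops, but reads the two operator lists from files."""
    return diff_ops(
        Path(external_path).read_text(),
        Path(internal_path).read_text(),
    )
```

For example, `diff_op_files("tmp.yaml", "internal.yaml")` would return the operators unique to each file, where `internal.yaml` stands in for wherever the internal tracer output was saved.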
QNNPACK only:
Example yaml from internal tracer: P460742166 [devserver]
Example yaml from external tracer: P460759099 [mac], P460742166 [devserver]
Comparison of ops between the internal and external tracers on devserver:
{F666923807}
{F666924048}
Note: The operators generated on mac and on devserver are different; the list from devserver includes two extra ops: `aten::addmm_` and `aten::slow_conv_dilated2d`. Based on the traced list, when calling `aten::_convolution`, one calls `aten::mkldnn_convolution` and the other calls `aten::_convolution_nogroup`, causing the divergence.
Thanks to Martin for pointing out:
> mkldnn is another backend from Intel
Reviewed By: dhruvbird
Differential Revision: D30599136
fbshipit-source-id: 102f23fb652c728a9ee4379f9acc43ae300d8e8a