[Static Runtime] Add pass to eliminate __getitem__/DictConstruct calls (#62429)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62429
Introduce a new pass that eliminates paired `prim::DictConstruct`/`aten::__getitem__` calls. Given a graph like this:
```
%2 : Dict = prim::DictConstruct(%key, %value)
%3 : Tensor = aten::__getitem__(%2, %key)
%4 : Tensor = op(%3)
```
This pass produces a graph like this (after dead code elimination):
```
%4 : Tensor = op(%value)
```
This optimization is applied in the static runtime.
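For illustration only, the rewrite can be sketched on a toy IR in Python. This is not the code added in this PR: the `Node` class, the `eliminate_dict_getitem` function, and the safety condition (the dict is only ever used as the container argument of `aten::__getitem__`, and the key is the same IR value that was passed to `prim::DictConstruct`) are assumptions made for the sketch. It also assumes every node is side-effect free, so a simple liveness sweep can stand in for dead code elimination.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    """Hypothetical toy IR node: `out = op(inputs...)`."""
    out: str
    op: str
    inputs: List[str]


def eliminate_dict_getitem(nodes: List[Node], outputs: List[str]) -> List[Node]:
    """Forward dict lookups to the stored value, then drop dead nodes."""
    dict_names = {n.out for n in nodes if n.op == "prim::DictConstruct"}

    # Assumed safety rule: skip any dict that escapes, i.e. is a graph
    # output or is used by anything other than __getitem__'s first input.
    unsafe = {name for name in outputs if name in dict_names}
    for n in nodes:
        for pos, name in enumerate(n.inputs):
            is_lookup = n.op == "aten::__getitem__" and pos == 0
            if name in dict_names and not is_lookup:
                unsafe.add(name)

    dicts = {}      # dict value name -> {key name: value name}
    replaced = {}   # __getitem__ output -> value it is forwarded to
    rewritten = []
    for n in nodes:
        # Rewrite inputs first so chained lookups resolve in one pass.
        inputs = [replaced.get(i, i) for i in n.inputs]
        if n.op == "prim::DictConstruct":
            # Inputs alternate key, value, key, value, ...
            dicts[n.out] = dict(zip(inputs[0::2], inputs[1::2]))
        elif n.op == "aten::__getitem__":
            container, key = inputs
            entries = dicts.get(container, {})
            if container not in unsafe and key in entries:
                replaced[n.out] = entries[key]
        rewritten.append(Node(n.out, n.op, inputs))

    # Toy dead code elimination: keep only nodes reachable from the outputs.
    live = {replaced.get(o, o) for o in outputs}
    kept = []
    for n in reversed(rewritten):
        if n.out in live:
            kept.append(n)
            live.update(n.inputs)
    kept.reverse()
    return kept


# The example graph from this summary.
graph = [
    Node("%2", "prim::DictConstruct", ["%key", "%value"]),
    Node("%3", "aten::__getitem__", ["%2", "%key"]),
    Node("%4", "op", ["%3"]),
]
optimized = eliminate_dict_getitem(graph, ["%4"])
```

On the example graph, `optimized` holds a single node, `%4 = op(%value)`. If `%2` were also a graph output, the sketch would leave all three nodes untouched.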
Test Plan:
`buck test //caffe2/test:jit -- TestPeephole`
**local.forward performance summary**
About 3% runtime benefit. All `DictConstruct` calls are optimized out, and roughly 50% of `__getitem__` calls are removed.
P438354810
**local_request_only.forward performance summary**
About 14% runtime benefit. Again, all `DictConstruct` calls are optimized out, and 50% of `__getitem__` calls are removed.
P438359742
Runtime measurements have some variance, so take these numbers with a grain of salt. Also note that the shrunk model sees no benefit, since it contains no `DictConstruct` calls.
Reviewed By: hlu1
Differential Revision: D29995087
fbshipit-source-id: f376376a46ff808115afd2d60446e5db8f6f752f