[Onnxifi] Cache output shape inference result for OnnxifiOp (#37796)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37796
Shape inference is costly. In bad cases, when we have a lot of uneven tail batches, we end up doing a large amount of shape inference. This diff enables each Onnxifi operator to cache the shape inference result for a given batch size. In the worst case, we hold `num_inference_threads * max_batch_size` OutputReshapeInfo objects per model, where `num_inference_threads` and `max_batch_size` are both smaller than 64.
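A minimal sketch of the per-batch-size caching idea described above, not the actual OnnxifiOp code: the `OutputReshapeInfo` layout and the `inferShapes` helper here are hypothetical placeholders, and only the lookup-or-compute pattern keyed by the real batch size is what this diff introduces.

```cpp
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the cached result of output shape inference.
struct OutputReshapeInfo {
  std::vector<std::vector<int64_t>> output_dims;
};

class OnnxifiOpSketch {
 public:
  // Return the reshape info for this batch size, running shape inference
  // only on a cache miss. Each op keeps at most max_batch_size entries,
  // and there is one op instance per inference thread, which gives the
  // num_inference_threads * max_batch_size bound mentioned above.
  const OutputReshapeInfo& getOutputReshapeInfo(int current_batch_size) {
    auto it = output_reshape_info_.find(current_batch_size);
    if (it == output_reshape_info_.end()) {
      it = output_reshape_info_
               .emplace(current_batch_size, inferShapes(current_batch_size))
               .first;
    }
    return it->second;
  }

 private:
  // Placeholder for the costly shape-inference step being cached.
  OutputReshapeInfo inferShapes(int current_batch_size) {
    OutputReshapeInfo info;
    info.output_dims.push_back({current_batch_size, /*feature dim=*/16});
    return info;
  }

  // Cache keyed by the real (possibly uneven tail) batch size.
  std::unordered_map<int, OutputReshapeInfo> output_reshape_info_;
};
```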
Reviewed By: benjibc
Differential Revision: D21389946
fbshipit-source-id: 23473e64c338d64d15c70292cca0056205d980eb