[PyTorch Edge] Conditionally trim dispatch key set to save heap memory at runtime (#65732)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65732
For certain on-device use cases, runtime memory comes at a premium. On-device deployments won't use all the available dispatch keys, so it makes sense to keep only the device-specific ones around, reducing the heap memory allocated at runtime.
This change keeps just 10 dispatch keys (the ones used on-device), guarded under the `C10_MOBILE_TRIM_DISPATCH_KEYS` macro. It tries to keep the other code paths unaffected: it uses `constexpr` values for the `array` declarations and simple inline functions so that the compiler can optimize these away for server builds.
Test Plan:
Build and check mobile models end to end.
```
buck build -c "pt.enable_milan_dispatch_keys_trimming"=1 //xplat/caffe2/fb/lite_predictor:lite_predictor
```
Reviewed By: ezyang
Differential Revision: D31185407
fbshipit-source-id: e954765606373dea6ee9466a851dca7684167b0b