[static runtime] Add binding for aten::argmin_out (#56638)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/56638
Test Plan:
```
./buck-out/opt/gen/caffe2/caffe2/fb/predictor/ptvsc2_predictor_bench \
  --scripted_model=/data/users/ansha/tmp/adfinder/aug_1x/210616848_0.predictor.disagg.local.local.pt \
  --pt_inputs=/data/users/ansha/tmp/adfinder/aug_1x/210616848_0.predictor.disagg.input_data.container.pt \
  --iters=500 \
  --warmup_iters=500 \
  --num_threads=1 \
  --pt_enable_static_runtime=1 \
  --pt_cleanup_activations=true \
  --pt_enable_out_variant=1 \
  --pt_optimize_memory=1 \
  --compare_results=1 \
  --do_profile=1 \
  --adsfinder_compatibility=1
```
```
Time per node type:
1.55901 ms. 35.3486%. fb::sigrid_transforms_torch_bind (1 nodes)
0.986321 ms. 22.3636%. aten::linear (6 nodes)
0.722277 ms. 16.3767%. aten::argmin (1 nodes)
0.256231 ms. 5.80971%. aten::matmul (1 nodes)
0.149653 ms. 3.39319%. fb::clip_ranges_gather_sigrid_hash_v3 (77 nodes)
0.105381 ms. 2.38938%. fb::clip_ranges_gather (263 nodes)
0.0911405 ms. 2.06649%. aten::sub (1 nodes)
0.0605429 ms. 1.37273%. aten::repeat (1 nodes)
0.0456569 ms. 1.03521%. aten::norm (1 nodes)
0.0421855 ms. 0.956501%. fb::batch_box_cox (1 nodes)
0.0370142 ms. 0.839249%. aten::__getitem__ (506 nodes)
0.0359091 ms. 0.814193%. prim::TupleUnpack (254 nodes)
0.0338332 ms. 0.767123%. aten::sigmoid (2 nodes)
0.0315159 ms. 0.714582%. aten::mul (3 nodes)
0.0297553 ms. 0.674662%. fb::offsets_to_ranges (253 nodes)
0.0279913 ms. 0.634666%. fb::simple_embedding_bag_sum (3 nodes)
0.0233521 ms. 0.529478%. aten::pow (1 nodes)
0.021296 ms. 0.48286%. fb::concat_add_mul_replacenan_clip (1 nodes)
0.0208991 ms. 0.473861%. fb::casted_batch_one_hot_lengths (1 nodes)
0.0183163 ms. 0.415298%. aten::sum (3 nodes)
0.0164318 ms. 0.372571%. prim::DictConstruct (2 nodes)
0.0160191 ms. 0.363211%. prim::TupleConstruct (1 nodes)
0.0126953 ms. 0.287849%. aten::div (1 nodes)
0.0106084 ms. 0.240532%. static_runtime::to_copy (8 nodes)
0.0092846 ms. 0.210516%. prim::ListConstruct (4 nodes)
0.00916175 ms. 0.207731%. fb::sigrid_hash_precompute (1 nodes)
0.00707015 ms. 0.160307%. aten::contiguous (1 nodes)
0.00621954 ms. 0.14102%. aten::narrow (4 nodes)
0.00302307 ms. 0.0685441%. aten::add (1 nodes)
0.00290759 ms. 0.0659259%. aten::full (1 nodes)
0.00283369 ms. 0.0642503%. aten::logit (1 nodes)
0.00239244 ms. 0.0542455%. fb::gather_ranges (4 nodes)
0.00220181 ms. 0.0499232%. aten::relu (1 nodes)
0.00211563 ms. 0.0479691%. static_runtime::reshape_copy (2 nodes)
0.0020059 ms. 0.0454812%. aten::stack (1 nodes)
0.00186682 ms. 0.0423276%. aten::clamp_min (1 nodes)
0.00172548 ms. 0.039123%. aten::size (3 nodes)
0.0011853 ms. 0.0268751%. aten::expand_as (1 nodes)
0.000881784 ms. 0.0199933%. fb::clip_ranges (2 nodes)
0.000835602 ms. 0.0189462%. fb::lengths_to_offsets (3 nodes)
0.000444376 ms. 0.0100757%. static_runtime::flatten_copy (1 nodes)
0.000197078 ms. 0.00446848%. prim::device (1 nodes)
4.4104 ms. in Total
StaticRuntime setup time: 0.000702 ms
Memory allocation time: 0.00943333 ms
Memory deallocation time: 0.062704 ms
Outputs deallocation time: 0.0477171 ms
Total memory managed: 831744 bytes
Total number of reused tensors: 31
W0421 14:53:04.841202 929500 PyTorchPredictorContainer.cpp:200] Failed to load metadata file
W0421 14:53:04.841315 929500 PyTorchPredictorContainer.cpp:457] Couldn't find model param config file xl_model_weights/model_param_config
I0421 14:53:04.841341 929500 PyTorchPredictorBenchLib.cpp:137] PyTorch predictor: number of prediction threads 1
I0421 14:53:04.971776 929500 PyTorchPredictorBenchLib.cpp:230] PyTorch run finished. Milliseconds per iter: 130.423. Iters per second: 7.66736
I0421 14:53:05.122830 929500 PtVsBlackBoxPredictorBenchLib.cpp:132] Finished comparing PT static runtime and jit interpreter results
```
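To sanity-check the per-node numbers in the profile above, a small stdlib-only script along these lines could be used. It is only a sketch and not part of this change: the regular expression, the `parse_node_times` helper, and the `SAMPLE`/`TOTAL_MS` names are assumptions for illustration, and the sample lines are copied from the output above.

```python
import re

# Matches per-node lines such as "0.722277 ms. 16.3767%. aten::argmin (1 nodes)".
# The pattern is an assumption based on the log format shown in this test plan.
NODE_LINE = re.compile(
    r"^\s*(?P<ms>[0-9.eE+-]+) ms\.\s+(?P<pct>[0-9.eE+-]+)%\.\s+"
    r"(?P<op>\S+) \((?P<nodes>\d+) nodes\)\s*$"
)


def parse_node_times(text):
    """Return {op_name: (milliseconds, percent, node_count)} for each per-node line."""
    rows = {}
    for line in text.splitlines():
        match = NODE_LINE.match(line)
        if match:
            rows[match.group("op")] = (
                float(match.group("ms")),
                float(match.group("pct")),
                int(match.group("nodes")),
            )
    return rows


# Three lines copied from the profile above, plus its reported total.
SAMPLE = """\
1.55901 ms. 35.3486%. fb::sigrid_transforms_torch_bind (1 nodes)
0.986321 ms. 22.3636%. aten::linear (6 nodes)
0.722277 ms. 16.3767%. aten::argmin (1 nodes)
"""
TOTAL_MS = 4.4104

rows = parse_node_times(SAMPLE)
ms, pct, nodes = rows["aten::argmin"]
# Recompute the share of total time from the raw milliseconds.
recomputed = 100.0 * ms / TOTAL_MS
print(f"aten::argmin: {ms} ms over {nodes} node(s), "
      f"reported {pct}%, recomputed {recomputed:.4f}%")
```

The recomputed share should agree with the reported percentage up to the rounding of the printed total.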
Reviewed By: hlu1
Differential Revision: D27923172
fbshipit-source-id: 05cf5497fb6ac39dd3ff24f583607a3dff8cae95