pytorch
0888b872 - [static runtime] binding for aten::clamp_min_out (#56635)

Commit
3 years ago
[static runtime] binding for aten::clamp_min_out (#56635)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56635

Test Plan:
```
./buck-out/opt/gen/caffe2/caffe2/fb/predictor/ptvsc2_predictor_bench \
  --scripted_model=/data/users/ansha/tmp/adfinder/aug_1x/210616848_0.predictor.disagg.local.local.pt \
  --pt_inputs=/data/users/ansha/tmp/adfinder/aug_1x/210616848_0.predictor.disagg.input_data.container.pt \
  --iters=500 \
  --warmup_iters=500 \
  --num_threads=1 \
  --pt_enable_static_runtime=1 \
  --pt_cleanup_activations=true \
  --pt_enable_out_variant=1 \
  --pt_optimize_memory=1 \
  --compare_results=1 \
  --do_profile=0 \
  --adsfinder_compatibility=1
```
```
Time per node type:
        1.50885 ms.    36.0064%. fb::sigrid_transforms_torch_bind (1 nodes)
        0.92296 ms.    22.0251%. aten::linear (6 nodes)
       0.695455 ms.     16.596%. aten::argmin (1 nodes)
       0.237931 ms.    5.67787%. aten::matmul (1 nodes)
       0.141634 ms.    3.37989%. fb::clip_ranges_gather_sigrid_hash_v3 (77 nodes)
      0.0925469 ms.     2.2085%. fb::clip_ranges_gather (263 nodes)
      0.0886556 ms.    2.11563%. aten::sub (1 nodes)
      0.0549624 ms.     1.3116%. aten::repeat (1 nodes)
       0.043996 ms.     1.0499%. aten::norm (1 nodes)
      0.0403472 ms.   0.962826%. fb::batch_box_cox (1 nodes)
      0.0371137 ms.   0.885664%. aten::sigmoid (2 nodes)
       0.035054 ms.   0.836512%. aten::__getitem__ (506 nodes)
      0.0338771 ms.   0.808427%. prim::TupleUnpack (254 nodes)
      0.0288516 ms.   0.688502%. aten::mul (3 nodes)
       0.026195 ms.   0.625106%. fb::offsets_to_ranges (253 nodes)
      0.0243627 ms.   0.581381%. aten::pow (1 nodes)
      0.0210347 ms.   0.501962%. fb::simple_embedding_bag_sum (3 nodes)
      0.0195358 ms.   0.466192%. fb::casted_batch_one_hot_lengths (1 nodes)
      0.0193484 ms.   0.461722%. fb::concat_add_mul_replacenan_clip (1 nodes)
      0.0164265 ms.   0.391995%. aten::sum (3 nodes)
      0.0157266 ms.   0.375291%. prim::TupleConstruct (1 nodes)
      0.0156512 ms.   0.373493%. prim::DictConstruct (2 nodes)
      0.0114427 ms.   0.273062%. aten::div (1 nodes)
     0.00884876 ms.   0.211163%. static_runtime::to_copy (8 nodes)
     0.00864496 ms.   0.206299%. prim::ListConstruct (4 nodes)
     0.00803458 ms.   0.191734%. fb::sigrid_hash_precompute (1 nodes)
     0.00619933 ms.   0.147938%. aten::contiguous (1 nodes)
     0.00462827 ms.   0.110447%. aten::narrow (4 nodes)
     0.00293105 ms.  0.0699452%. aten::logit (1 nodes)
     0.00287083 ms.  0.0685082%. static_runtime::reshape_copy (2 nodes)
     0.00250605 ms.  0.0598032%. aten::add (1 nodes)
     0.00217015 ms.  0.0517875%. fb::gather_ranges (4 nodes)
     0.00202655 ms.  0.0483607%. aten::full (1 nodes)
     0.00200812 ms.  0.0479208%. aten::relu (1 nodes)
     0.00175433 ms.  0.0418644%. aten::stack (1 nodes)
     0.00174899 ms.   0.041737%. aten::clamp_min (1 nodes)
     0.00134367 ms.  0.0320646%. aten::size (3 nodes)
    0.000811416 ms.  0.0193633%. fb::clip_ranges (2 nodes)
    0.000801096 ms.   0.019117%. aten::expand_as (1 nodes)
    0.000541452 ms.   0.012921%. fb::lengths_to_offsets (3 nodes)
    0.000477838 ms.  0.0114029%. static_runtime::flatten_copy (1 nodes)
    0.000192906 ms. 0.00460342%. prim::device (1 nodes)
        4.19049 ms. in Total
StaticRuntime setup time: 0.000408 ms
Memory allocation time: 0.00895982 ms
Memory deallocation time: 0.0587527 ms
Outputs deallocation time: 0.0430985 ms
Total memory managed: 947328 bytes
Total number of reused tensors: 28
W0421 14:33:55.610956 836281 PyTorchPredictorContainer.cpp:200] Failed to load metadata file
W0421 14:33:55.611043 836281 PyTorchPredictorContainer.cpp:457] Couldn't find model param config file xl_model_weights/model_param_config
I0421 14:33:55.611063 836281 PyTorchPredictorBenchLib.cpp:137] PyTorch predictor: number of prediction threads 1
I0421 14:33:55.736069 836281 PyTorchPredictorBenchLib.cpp:230] PyTorch run finished. Milliseconds per iter: 124.995. Iters per second: 8.0003
I0421 14:33:55.874794 836281 PtVsBlackBoxPredictorBenchLib.cpp:132] Finished comparing PT static runtime and jit interpreter results
```

Reviewed By: hlu1

Differential Revision: D27922570

fbshipit-source-id: 095aa9bd0c425bc73eb48841653441d5c9e45744
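
For readers who want to check where `aten::clamp_min` lands in the per-node-type profile above, the following is a minimal Python sketch that parses lines of the form `<time> ms. <pct>%. <op> (<n> nodes)`. The line format is inferred only from the output in this commit message, and `parse_node_times` and `LINE_RE` are hypothetical names used for illustration; neither is part of PyTorch or the benchmark tool.

```python
import re

# Assumed line shape, inferred from the benchmark output in this commit:
#   "<time> ms. <pct>%. <op> (<n> nodes)"
LINE_RE = re.compile(
    r"^\s*([0-9.]+) ms\.\s+([0-9.]+)%\.\s+(\S+) \((\d+) nodes\)\s*$"
)


def parse_node_times(text):
    """Return one dict per profile line; lines that do not match are skipped."""
    rows = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            rows.append(
                {
                    "ms": float(m.group(1)),
                    "pct": float(m.group(2)),
                    "op": m.group(3),
                    "nodes": int(m.group(4)),
                }
            )
    return rows


# A few lines copied from the profile above; the "in Total" line is skipped
# because it does not follow the per-op shape.
SAMPLE = """\
1.50885 ms. 36.0064%. fb::sigrid_transforms_torch_bind (1 nodes)
0.92296 ms. 22.0251%. aten::linear (6 nodes)
0.00174899 ms. 0.041737%. aten::clamp_min (1 nodes)
4.19049 ms. in Total
"""

rows = parse_node_times(SAMPLE)
by_op = {row["op"]: row for row in rows}
print(by_op["aten::clamp_min"])
```

Feeding the full profile block to `parse_node_times` would give one row per op type, which can then be sorted or filtered as needed.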