pytorch
46321cb9 - [static runtime] binding for aten::norm_out (#56636)

Commit
3 years ago
[static runtime] binding for aten::norm_out (#56636)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/56636

Test Plan:
Test it runs on the aug_1x model, which has aten::norm, and verify jit/sr results

```
./buck-out/opt/gen/caffe2/caffe2/fb/predictor/ptvsc2_predictor_bench \
  --scripted_model=/data/users/ansha/tmp/adfinder/aug_1x/210616848_0.predictor.disagg.local.local.pt \
  --pt_inputs=/data/users/ansha/tmp/adfinder/aug_1x/210616848_0.predictor.disagg.input_data.container.pt \
  --iters=500 --warmup_iters=500 --num_threads=1 \
  --pt_enable_static_runtime=1 --pt_cleanup_activations=true \
  --pt_enable_out_variant=1 --pt_optimize_memory=1 \
  --compare_results=1 --do_profile=1 --adsfinder_compatibility=1
```

```
Time per node type:
    1.53159 ms. 35.8619%. fb::sigrid_transforms_torch_bind (1 nodes)
    0.9481 ms. 22.1996%. aten::linear (6 nodes)
    0.704806 ms. 16.5029%. aten::argmin (1 nodes)
    0.252252 ms. 5.90643%. aten::matmul (1 nodes)
    0.140869 ms. 3.29842%. fb::clip_ranges_gather_sigrid_hash_v3 (77 nodes)
    0.100014 ms. 2.34181%. fb::clip_ranges_gather (263 nodes)
    0.0880838 ms. 2.06247%. aten::sub (1 nodes)
    0.0553556 ms. 1.29614%. aten::repeat (1 nodes)
    0.0438464 ms. 1.02665%. aten::norm (1 nodes)
    0.0395956 ms. 0.927124%. fb::batch_box_cox (1 nodes)
    0.035834 ms. 0.839045%. aten::__getitem__ (506 nodes)
    0.0345233 ms. 0.808357%. prim::TupleUnpack (254 nodes)
    0.0316876 ms. 0.741959%. aten::sigmoid (2 nodes)
    0.0293246 ms. 0.686629%. aten::mul (3 nodes)
    0.0287696 ms. 0.673635%. fb::offsets_to_ranges (253 nodes)
    0.0242373 ms. 0.567511%. aten::pow (1 nodes)
    0.0224204 ms. 0.52497%. fb::simple_embedding_bag_sum (3 nodes)
    0.0200074 ms. 0.468469%. fb::casted_batch_one_hot_lengths (1 nodes)
    0.0190264 ms. 0.445499%. fb::concat_add_mul_replacenan_clip (1 nodes)
    0.0167253 ms. 0.39162%. prim::TupleConstruct (1 nodes)
    0.0164962 ms. 0.386255%. aten::sum (3 nodes)
    0.0158986 ms. 0.372262%. prim::DictConstruct (2 nodes)
    0.0109372 ms. 0.256093%. aten::div (1 nodes)
    0.00910563 ms. 0.213207%. prim::ListConstruct (4 nodes)
    0.00876917 ms. 0.205328%. static_runtime::to_copy (8 nodes)
    0.00822567 ms. 0.192603%. fb::sigrid_hash_precompute (1 nodes)
    0.00622559 ms. 0.145771%. aten::contiguous (1 nodes)
    0.00460064 ms. 0.107723%. aten::narrow (4 nodes)
    0.00297164 ms. 0.0695804%. static_runtime::reshape_copy (2 nodes)
    0.00287099 ms. 0.0672237%. aten::logit (1 nodes)
    0.00277557 ms. 0.0649894%. aten::add (1 nodes)
    0.00264978 ms. 0.0620441%. aten::clamp_min (1 nodes)
    0.00215832 ms. 0.0505366%. aten::relu (1 nodes)
    0.00213779 ms. 0.050056%. fb::gather_ranges (4 nodes)
    0.00195846 ms. 0.0458571%. aten::full (1 nodes)
    0.00177333 ms. 0.0415222%. aten::stack (1 nodes)
    0.00147449 ms. 0.034525%. aten::size (3 nodes)
    0.000762524 ms. 0.0178544%. aten::expand_as (1 nodes)
    0.000757406 ms. 0.0177345%. fb::clip_ranges (2 nodes)
    0.000614798 ms. 0.0143954%. fb::lengths_to_offsets (3 nodes)
    0.000407952 ms. 0.00955212%. static_runtime::flatten_copy (1 nodes)
    0.000159918 ms. 0.00374445%. prim::device (1 nodes)
    4.2708 ms. in Total
StaticRuntime setup time: 0.000407 ms
Memory allocation time: 0.0089714 ms
Memory deallocation time: 0.0592135 ms
Outputs deallocation time: 0.0458097 ms
Total memory managed: 947328 bytes
Total number of reused tensors: 28
```

Reviewed By: hlu1

Differential Revision: D27922070

fbshipit-source-id: 538b39b7fff0638fc994b7983bf32d9e9f15d016
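When checking a profile like the one in the test plan, it can help to pull out the row for a single op such as `aten::norm` rather than scanning the list by eye. The following is a minimal sketch using only the Python standard library. The names `ROW`, `parse_node_times`, and `sample` are illustrative and are not part of PyTorch or the benchmark tool, and the row pattern is an assumption based solely on the output shown in this commit message.

```python
import re

# One profile row, e.g. "0.0438464 ms. 1.02665%. aten::norm (1 nodes)".
# The layout is inferred from the output in this commit message (assumption).
ROW = re.compile(
    r"(\d+(?:\.\d+)?) ms\.\s+(\d+(?:\.\d+)?)%\.\s+(\S+) \((\d+) nodes\)"
)


def parse_node_times(text):
    """Return {op_name: (milliseconds, percent, node_count)} for each row found."""
    rows = {}
    for ms, pct, op, count in ROW.findall(text):
        rows[op] = (float(ms), float(pct), int(count))
    return rows


# A short excerpt copied from the profile above.
sample = """
Time per node type:
    0.0553556 ms. 1.29614%. aten::repeat (1 nodes)
    0.0438464 ms. 1.02665%. aten::norm (1 nodes)
    0.0395956 ms. 0.927124%. fb::batch_box_cox (1 nodes)
"""

rows = parse_node_times(sample)
ms, pct, count = rows["aten::norm"]
print(f"aten::norm: {ms} ms ({pct}%) across {count} node(s)")
```

Because the pattern matches rows wherever they appear in the text, the same function works whether the profile is printed one row per line or pasted as a single run of text.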