simplify profile text output by displaying only top-level ops statistics (#42262)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/42262
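A minimal sketch of what "top-level only" means, not the profiler's actual implementation: an op counts as top-level when its time range is not nested inside another op's range. The `Event` class and `top_level_events` helper are hypothetical names for illustration, and the sketch assumes properly nested events on a single thread.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Event:
    # Hypothetical stand-in for a profiler event; not a PyTorch type.
    name: str
    start_us: float
    end_us: float


def top_level_events(events: List[Event]) -> List[Event]:
    """Return the events that are not nested inside another event's time range."""
    top_level: List[Event] = []
    current_end = float("-inf")
    # Sort by start time; on ties the longer (enclosing) event comes first.
    for ev in sorted(events, key=lambda e: (e.start_us, -e.end_us)):
        if ev.start_us >= current_end:
            # Starts after the last top-level event finished, so it has no parent.
            top_level.append(ev)
            current_end = ev.end_us
    return top_level


events = [
    Event("aten::lstm", 0.0, 100.0),
    Event("aten::addmm", 5.0, 20.0),      # nested in the first lstm call
    Event("aten::mul", 25.0, 30.0),       # nested in the first lstm call
    Event("aten::lstm", 110.0, 200.0),
    Event("aten::sigmoid_", 120.0, 130.0),  # nested in the second lstm call
]
print([ev.name for ev in top_level_events(events)])
```

With the sample events, only the two `aten::lstm` calls survive the filter, which mirrors how the simplified report in the test plan lists `aten::lstm` alone.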
Test Plan:
Imported from OSS
```
==================================================================================================================================================================================
TEST
-----------------------------  ---------------  ---------------  ---------------  ---------------  ---------------  ---------------  ---------------------------------------------
Name                           Self CPU total % Self CPU total   CPU total %      CPU total        CPU time avg     Number of Calls  Input Shapes
-----------------------------  ---------------  ---------------  ---------------  ---------------  ---------------  ---------------  ---------------------------------------------
aten::add_                     3.61%            462.489us        3.61%            462.489us        462.489us        1                [[3, 20], [3, 20], []]
aten::slice                    1.95%            249.571us        1.95%            250.018us        250.018us        1                [[3, 80], [], [], [], []]
aten::lstm                     1.89%            242.534us        22.41%           2.872ms          2.872ms          1                [[5, 3, 10], [], [], [], [], [], [], [], []]
aten::lstm                     1.68%            215.852us        18.18%           2.330ms          2.330ms          1                [[5, 3, 10], [], [], [], [], [], [], [], []]
aten::lstm                     1.68%            215.767us        18.49%           2.370ms          2.370ms          1                [[5, 3, 10], [], [], [], [], [], [], [], []]
aten::lstm                     1.60%            205.014us        20.15%           2.582ms          2.582ms          1                [[5, 3, 10], [], [], [], [], [], [], [], []]
aten::lstm                     1.55%            198.213us        18.53%           2.375ms          2.375ms          1                [[5, 3, 10], [], [], [], [], [], [], [], []]
aten::addmm                    0.95%            122.359us        1.01%            129.857us        129.857us        1                [[80], [3, 20], [20, 80], [], []]
aten::stack                    0.29%            36.745us         0.63%            80.179us         80.179us         1                [[], []]
aten::add_                     0.28%            35.694us         0.28%            35.694us         35.694us         1                [[3, 20], [3, 20], []]
-----------------------------  ---------------  ---------------  ---------------  ---------------  ---------------  ---------------  ---------------------------------------------
Self CPU time total: 12.817ms
-----------------------------  ---------------  ---------------  ---------------  ---------------  ---------------  ---------------  ---------------------------------------------
Name                           Self CPU total % Self CPU total   CPU total %      CPU total        CPU time avg     Number of Calls  Input Shapes
-----------------------------  ---------------  ---------------  ---------------  ---------------  ---------------  ---------------  ---------------------------------------------
aten::mul                      11.45%           1.467ms          12.88%           1.651ms          11.006us         150              [[3, 20], [3, 20]]
aten::lstm                     8.41%            1.077ms          97.76%           12.529ms         2.506ms          5                [[5, 3, 10], [], [], [], [], [], [], [], []]
aten::addmm                    7.65%            979.982us        11.38%           1.459ms          29.182us         50               [[80], [3, 20], [20, 80], [], []]
aten::sigmoid_                 6.78%            869.295us        9.74%            1.249ms          8.327us          150              [[3, 20]]
aten::add_                     5.82%            745.801us        5.82%            745.801us        14.916us         50               [[3, 20], [3, 20], []]
aten::slice                    5.58%            715.532us        6.61%            847.445us        4.237us          200              [[3, 80], [], [], [], []]
aten::unsafe_split             4.24%            544.015us        13.25%           1.698ms          33.957us         50               [[3, 80], [], []]
aten::tanh                     3.11%            398.881us        6.05%            775.024us        15.500us         50               [[3, 20]]
aten::empty                    3.04%            389.055us        3.04%            389.055us        1.319us          295              [[], [], [], [], [], []]
aten::sigmoid                  2.96%            379.686us        2.96%            379.686us        2.531us          150              [[3, 20], [3, 20]]
-----------------------------  ---------------  ---------------  ---------------  ---------------  ---------------  ---------------  ---------------------------------------------
Self CPU time total: 12.817ms
==================================================================================================================================================================================
TEST
==================================================================================================================================================================================
This report only display top-level ops statistics
-----------------------------  ---------------  ---------------  ---------------  ---------------  ---------------  ---------------  ---------------------------------------------
Name                           Self CPU total % Self CPU total   CPU total %      CPU total        CPU time avg     Number of Calls  Input Shapes
-----------------------------  ---------------  ---------------  ---------------  ---------------  ---------------  ---------------  ---------------------------------------------
aten::lstm                     1.89%            242.534us        22.41%           2.872ms          2.872ms          1                [[5, 3, 10], [], [], [], [], [], [], [], []]
aten::lstm                     1.68%            215.852us        18.18%           2.330ms          2.330ms          1                [[5, 3, 10], [], [], [], [], [], [], [], []]
aten::lstm                     1.68%            215.767us        18.49%           2.370ms          2.370ms          1                [[5, 3, 10], [], [], [], [], [], [], [], []]
aten::lstm                     1.60%            205.014us        20.15%           2.582ms          2.582ms          1                [[5, 3, 10], [], [], [], [], [], [], [], []]
aten::lstm                     1.55%            198.213us        18.53%           2.375ms          2.375ms          1                [[5, 3, 10], [], [], [], [], [], [], [], []]
-----------------------------  ---------------  ---------------  ---------------  ---------------  ---------------  ---------------  ---------------------------------------------
Self CPU time total: 12.817ms
==================================================================================================================================================================================
This report only display top-level ops statistics
-----------------------------  ---------------  ---------------  ---------------  ---------------  ---------------  ---------------  ---------------------------------------------
Name                           Self CPU total % Self CPU total   CPU total %      CPU total        CPU time avg     Number of Calls  Input Shapes
-----------------------------  ---------------  ---------------  ---------------  ---------------  ---------------  ---------------  ---------------------------------------------
aten::lstm                     8.41%            1.077ms          97.76%           12.529ms         2.506ms          5                [[5, 3, 10], [], [], [], [], [], [], [], []]
-----------------------------  ---------------  ---------------  ---------------  ---------------  ---------------  ---------------  ---------------------------------------------
Self CPU time total: 12.817ms
Total time based on python measurements: 13.206ms
CPU time measurement python side overhead: 3.03%
```
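The overhead figure on the last line of the output can be approximated from the two totals printed just before it. This is a sketch under an assumption: the formula (relative excess of the Python-side measurement over the profiler's self CPU total) is inferred from the numbers and is not stated in this message. The variable names are illustrative.

```python
# Totals copied from the report; they are already rounded to three decimals.
self_cpu_total_ms = 12.817
python_total_ms = 13.206

# Assumed formula: how much longer the Python-side timing is, relative to
# the profiler's self CPU total, expressed as a percentage.
overhead_pct = (python_total_ms / self_cpu_total_ms - 1.0) * 100.0
print(f"{overhead_pct:.1f}%")
```

Because the inputs are rounded, the result agrees with the reported 3.03% only to within about a hundredth of a percentage point.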
Reviewed By: ilia-cher
Differential Revision: D22830328
Pulled By: ilia-cher
fbshipit-source-id: c9a71be7b23a8f84784117c788faa43caa96f545