Memory profiling (#37775)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37775
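Adds memory usage to the profiler table output. Passing `profile_memory=True` to the profiler adds a "CPU Mem Total" column to the table, and the table can be sorted by `cpu_memory_usage`.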
Test Plan:
BUILD_BINARY=1 USE_BLAS=MKL USE_MKLDNN=0 USE_CUDA=0 python setup.py develop install --cmake
```
import torch
import torchvision.models as models
model = models.resnet18()
inp = torch.randn(5, 3, 224, 224)
with torch.autograd.profiler.profile(profile_memory=True, record_shapes=True) as prof:
    model(inp)
print(prof.key_averages(group_by_input_shape=True).table(sort_by="cpu_memory_usage", row_limit=15))
```
```
---------------------------  ---------------  ---------------  ---------------  ---------------  ---------------  ---------------  ---------------  -----------------------------------
Name                         Self CPU total % Self CPU total   CPU total %      CPU total        CPU time avg     CPU Mem Total    Number of Calls  Input Shapes
---------------------------  ---------------  ---------------  ---------------  ---------------  ---------------  ---------------  ---------------  -----------------------------------
resize_                      0.37%            577.936us        0.37%            577.936us        9.796us          339.03 Mb        59               [[0]]
empty                        0.69%            1.061ms          0.74%            1.139ms          5.556us          47.42 Mb         205              []
stride                       0.00%            0.853us          0.00%            0.853us          0.853us          19.53 Kb         1                [[5, 1000]]
empty_strided                0.01%            21.393us         0.02%            26.033us         5.207us          252 b            5                []
is_complex                   0.02%            37.425us         0.02%            37.425us         1.291us          208 b            29               [[]]
masked_select                0.04%            55.333us         0.06%            93.616us         46.808us         120 b            2                [[30], [30]]
conv2d                       0.01%            18.009us         9.62%            14.902ms         14.902ms         0 b              1                [[5, 3, 224, 224], [64, 3, 7, 7], [
convolution                  0.01%            12.436us         9.61%            14.884ms         14.884ms         0 b              1                [[5, 3, 224, 224], [64, 3, 7, 7], [
_convolution                 0.03%            52.381us         9.60%            14.871ms         14.871ms         0 b              1                [[5, 3, 224, 224], [64, 3, 7, 7], [
size                         0.00%            5.429us          0.00%            5.429us          0.339us          0 b              16               [[5, 3, 224, 224]]
contiguous                   0.00%            1.934us          0.00%            1.934us          0.967us          0 b              2                [[5, 3, 224, 224]]
_convolution_nogroup         0.02%            27.505us         9.57%            14.814ms         14.814ms         0 b              1                [[5, 3, 224, 224], [64, 3, 7, 7], [
_nnpack_available            0.02%            34.267us         0.02%            34.267us         1.713us          0 b              20               []
thnn_conv2d                  0.01%            13.274us         9.54%            14.771ms         14.771ms         0 b              1                [[5, 3, 224, 224], [64, 3, 7, 7], [
thnn_conv2d_forward          5.98%            9.264ms          19.02%           29.446ms         14.723ms         0 b              2                [[5, 3, 224, 224], [64, 3, 7, 7], [
---------------------------  ---------------  ---------------  ---------------  ---------------  ---------------  ---------------  ---------------  -----------------------------------
Self CPU time total: 154.855ms
```
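Reading the "CPU Mem Total" column: the values appear to use 1024-based units. This is an inference from the output above, not something stated in this change: the 19.53 Kb row is consistent with a float32 tensor of shape [5, 1000] (5 * 1000 * 4 = 20000 bytes, and 20000 / 1024 ≈ 19.53). A minimal sketch of that conversion, using a hypothetical helper `format_memory` that is not the profiler's own code:

```python
def format_memory(nbytes):
    """Render a byte count like the table's CPU Mem column.

    Hypothetical helper for illustration; assumes 1024-based Kb/Mb units.
    """
    KB = 1024
    MB = 1024 * KB
    if abs(nbytes) >= MB:
        return "{:.2f} Mb".format(nbytes / MB)
    if abs(nbytes) >= KB:
        return "{:.2f} Kb".format(nbytes / KB)
    return "{} b".format(nbytes)


# A float32 tensor of shape [5, 1000] occupies 5 * 1000 * 4 bytes.
logits_bytes = 5 * 1000 * 4
print(format_memory(logits_bytes))  # prints "19.53 Kb"
```

Small values such as 252 bytes fall below the Kb threshold and are rendered as plain bytes, matching the `252 b` style seen in the table.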
Reviewed By: ngimel
Differential Revision: D21384248
Pulled By: ilia-cher
fbshipit-source-id: 31359cce2aa06f6255ed1ad8c60d03cb640bfec3