fix bug for short model executions (#1391)
Summary:
If a model's execution time is too short, e.g. 1 ms, the CPU peak-memory measurements are discarded. This is because creating and starting the peak-memory monitor has some overhead, which can make the record timestamps slightly larger than the stop_timestamp. This happens only for a few tiny models, such as `phlippe_resnet` with torchinductor and torchscript, and `pyhpc_equation_of_state` with torchscript. I did not observe the same issue in the GPU memory records, but I updated the GPU path as well.
This PR fixes the issue by always reserving at least one record of CPU and GPU memory usage.
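A minimal sketch of the idea (function and variable names here are hypothetical, not the actual benchmark code): when trimming memory records to the measurement window, keep at least one record instead of discarding them all, so very short executions still report a peak value.

```python
# Hypothetical illustration of "reserve at least one record", assuming
# memory records are (timestamp, bytes_used) pairs collected by a monitor.

def filter_memory_records(records, stop_timestamp):
    """Drop records whose timestamp exceeds stop_timestamp, but always
    keep at least one record if any were collected, so that tiny models
    whose monitor-start overhead pushes every timestamp past
    stop_timestamp still produce a peak-memory measurement."""
    kept = [(ts, mem) for ts, mem in records if ts <= stop_timestamp]
    if not kept and records:
        # Every timestamp landed after stop_timestamp; reserve the
        # earliest record rather than returning an empty measurement.
        kept = [min(records, key=lambda r: r[0])]
    return kept
```

Without the fallback branch, a 1 ms execution whose records all carry post-stop timestamps would yield an empty list, and the peak-memory metric would be silently lost.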
Pull Request resolved: https://github.com/pytorch/benchmark/pull/1391
Reviewed By: erichan1
Differential Revision: D42908415
Pulled By: xuzhao9
fbshipit-source-id: 0971ab6a4512b6c4cf7cd865ec9bd089472a1a32