[Static Runtime] Add first iter metric (#64457)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64457
The first iteration is special since it initializes the memory planner. This change logs and reports first iteration time during benchmarking. It also generates a FAI-PEP output when `generate_ai_pep_output` is set.
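For illustration only, the idea of timing the first iteration separately from the steady-state iterations can be sketched in Python. This is not the Static Runtime C++ implementation: `benchmark_with_first_iter`, `format_observer_line`, and `main_iters` are hypothetical names, and the JSON layout just mirrors the `PyTorchObserver` log line shown in the test plan.

```python
import json
import time


def benchmark_with_first_iter(run_once, main_iters=100):
    """Time the first call separately from the steady-state calls.

    `run_once` is any zero-argument callable (hypothetical stand-in for
    one inference run). Returns (first_iter_ms, avg_iter_ms).
    """
    # The first call may include one-time setup cost, so it is timed alone.
    start = time.perf_counter()
    run_once()
    first_iter_ms = (time.perf_counter() - start) * 1000.0

    # The remaining calls are averaged to get the steady-state latency.
    start = time.perf_counter()
    for _ in range(main_iters):
        run_once()
    avg_iter_ms = (time.perf_counter() - start) * 1000.0 / main_iters
    return first_iter_ms, avg_iter_ms


def format_observer_line(value_ms, metric_type):
    """Build a log line shaped like the PyTorchObserver output above."""
    payload = {
        "value": value_ms,
        "unit": "ms",
        "metric": "latency",
        "type": metric_type,
    }
    return "PyTorchObserver " + json.dumps(payload, separators=(",", ":"))


if __name__ == "__main__":
    first_ms, avg_ms = benchmark_with_first_iter(lambda: sum(range(1000)))
    print(format_observer_line(first_ms, "static_runtime_first_iter"))
    print(f"First iter time: {first_ms:.5f} ms")
    print(f"Average iter time: {avg_ms:.5f} ms")
```

Because the first-iteration value comes from a single timed call while the average covers `main_iters` calls, the two numbers are reported separately rather than folded together.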
Test Plan:
Run any benchmark, and observe:
```
I0902 15:19:32.528977 2492358 impl.cpp:948] PyTorchObserver {"value":6.415958881378174,"unit":"ms","metric":"latency","type":"static_runtime_first_iter"}
...
First iter time: 6.41596 ms
```
Note that this metric is likely to be significantly noisier than the others, since it is computed from far fewer data points.
Unit tests: `buck test //caffe2/test:static_runtime`
Reviewed By: d1jang
Differential Revision: D30740619
fbshipit-source-id: 4dcfccd5629f4fa34254fd355073ef19e151245a