fix bug for record collections (#1366)

Commit

3 years ago

Summary: When the model analyzer collects GPU and CPU metrics, the monitor threads do not exit safely, and the last batch of records is discarded. This has only a small impact on the final metric values, because all values are aggregated and averaged. However, when the `--export-metrics` option is enabled to export all detailed metric information, the discarded records significantly affect the correctness of the result.

This PR fixes the bug as follows:
- Use `self._thread.wait()` to wait for the monitor threads to finish.
- Add an extra `self._monitoring_iteration()` call to collect the last batch of records after setting `self._thread_active` to false.

Pull Request resolved: https://github.com/pytorch/benchmark/pull/1366
Reviewed By: erichan1
Differential Revision: D42464909
Pulled By: xuzhao9
fbshipit-source-id: 1f0bbe798d4fdd6271013694e499be5e9d40a252
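The shutdown pattern described in the summary can be sketched roughly as follows. This is a minimal illustration, not the model analyzer's actual code: the `Monitor` class, `sample_fn`, `interval`, `start`, and `stop` are hypothetical names, and only `_thread`, `_thread_active`, and `_monitoring_iteration` come from the commit message. Because the type of `self._thread` is not shown here, the sketch assumes a plain `threading.Thread` and waits with `join()` instead of the `wait()` call mentioned above.

```python
import threading
import time


class Monitor:
    """Hypothetical periodic sampler, used only to illustrate the fix."""

    def __init__(self, sample_fn, interval=0.01):
        self._sample_fn = sample_fn      # assumed callable returning one record
        self._interval = interval        # assumed sampling period in seconds
        self._records = []
        self._thread_active = False
        self._thread = None

    def _monitoring_iteration(self):
        # Collect a single record.
        self._records.append(self._sample_fn())

    def _monitoring_loop(self):
        # Runs in the background thread until the flag is cleared.
        while self._thread_active:
            self._monitoring_iteration()
            time.sleep(self._interval)

    def start(self):
        self._thread_active = True
        self._thread = threading.Thread(target=self._monitoring_loop)
        self._thread.start()

    def stop(self):
        # 1. Ask the loop to exit.
        self._thread_active = False
        # 2. Wait until the background thread has really finished.
        self._thread.join()
        # 3. Take one more sample so the final interval is not lost.
        self._monitoring_iteration()
        return list(self._records)
```

Waiting for the thread before the final sample means the extra iteration never runs concurrently with the loop, so the last record is always the most recent one.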

Author

FindHao

Committer

facebook-github-bot

Parents

eb49327b

benchmark 7ab73738 - fix bug for record collections (#1366)