Always emit end events even on failure, use thread local storage for stack (#2432)
Summary:
Pull Request resolved: https://github.com/pytorch/benchmark/pull/2432
X-link: https://github.com/pytorch/pytorch/pull/134279
We should always emit an end event in a finally block so that the event stack stays correct even if a unit test or job fails partway through.
Also, we now keep the stack in thread-local storage, so that in multithreaded scenarios each thread pushes to and pops from its own stack.
Reviewed By: laithsakka
Differential Revision: D61682556
fbshipit-source-id: ece87c4c198233f439d5af5678b56de7350b116e