pytorch
65aa2b65 - [TensorPipe] Close and join TP context at shutdown (#38934)

Commit
4 years ago
[TensorPipe] Close and join TP context at shutdown (#38934)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38934

The TensorPipe context contains all the threads and global state. It needs to be closed and joined upon shutdown (joining implicitly closes it). Destructing the context implicitly joins it, which is what was happening so far: we were waiting for the RPC agent to be destroyed for the TP context to be closed.

However, I was seeing some TSAN errors that seemed to be happening during process termination, where the SHM reactor thread was trying to log something on GoogleLog while a static member of GoogleLog was being destructed. I suspect this means that the TP agent was being "leaked" (probably because the `RpcAgent::currentRpcAgent_` static field was still storing it) and thus was destroyed too late. The obvious solution seems to be to destroy it earlier, when GoogleLog is still active.

Test Plan: I guess land this and see if the TSAN flakes keep happening? testinprod

Differential Revision: D21703016

fbshipit-source-id: d117e619bb835192b1f3c8e2eb3cee94dbdb050f
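The shutdown ordering described in the commit message can be illustrated with a minimal, self-contained sketch. This is not the actual TensorPipe or PyTorch code: `Context`, `Agent`, `g_currentAgent`, and `shutdownStopsWorkerWhileStillReferenced` are hypothetical stand-ins, with `g_currentAgent` playing the role of a static field that keeps the agent alive until static destruction. The point is that an explicit `join()` at shutdown stops the background thread early, so the implicit join in the destructor has nothing left to do when it runs late.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <memory>
#include <thread>

// Hypothetical stand-in for a context that owns a background thread.
class Context {
 public:
  Context() : worker_([this] { loop(); }) {}

  // Ask the worker to stop; does not wait for it.
  void close() { closed_.store(true); }

  // Joining implicitly closes, then waits for the worker to exit.
  // Safe to call more than once from the same thread.
  void join() {
    close();
    if (worker_.joinable()) {
      worker_.join();
    }
  }

  // Destruction implicitly joins. If join() already ran, this is a no-op.
  ~Context() { join(); }

  bool workerExited() const { return exited_.load(); }

 private:
  void loop() {
    while (!closed_.load()) {
      std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }
    exited_.store(true);
  }

  // Flags are declared before the thread so they are initialized first.
  std::atomic<bool> closed_{false};
  std::atomic<bool> exited_{false};
  std::thread worker_;
};

// Hypothetical agent that owns the context.
class Agent {
 public:
  // Explicit shutdown: stop the context's thread now instead of
  // waiting for the agent to be destroyed.
  void shutdown() { context_.join(); }
  bool contextWorkerExited() const { return context_.workerExited(); }

 private:
  Context context_;
};

// Hypothetical analogue of a static field that still stores the agent,
// delaying its destruction until static destructors run.
std::shared_ptr<Agent> g_currentAgent;

// Returns true if the worker thread has stopped after shutdown(),
// even though the global still references the agent.
bool shutdownStopsWorkerWhileStillReferenced() {
  g_currentAgent = std::make_shared<Agent>();
  g_currentAgent->shutdown();
  return g_currentAgent != nullptr && g_currentAgent->contextWorkerExited();
}
```

In this sketch the agent object is still destroyed late, but by then no thread is running that could touch other globals that are being torn down, which is the property the commit is after.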
Author
lw