[Static Runtime] support fork and wait operations on Static Runtime (#79211)
Summary:
- Fork was initially supported only in the JIT interpreter. This patch enables async execution in Static Runtime as well.
- For each forked node, a separate runtime is created to execute the subgraph. Async execution is handled by the aten::ParallelThreadPoolNative thread pool.
- aten::wait waits for the future returned by the fork to complete.
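
The fork/wait pattern described above can be pictured with a minimal stand-in. This sketch assumes Python's standard `concurrent.futures` in place of the native thread pool; the names `fork`, `wait`, and `subgraph` are hypothetical illustrations and are not Static Runtime APIs:

```python
from concurrent.futures import Future, ThreadPoolExecutor

# Stand-in for the native thread pool (assumption: illustration only).
_pool = ThreadPoolExecutor(max_workers=4)


def fork(subgraph, *args) -> Future:
    """Schedule `subgraph` on the pool and return a future immediately."""
    return _pool.submit(subgraph, *args)


def wait(future: Future):
    """Block until the forked work finishes, then return its result."""
    return future.result()


def subgraph(x: int) -> int:
    # Hypothetical forked computation.
    return x * 2


# Fork three independent pieces of work, then wait on each future in order.
futures = [fork(subgraph, i) for i in range(3)]
results = [wait(f) for f in futures]
print(results)
```

The caller keeps running after each `fork` and only blocks at `wait`, which is the behavior this patch brings to Static Runtime.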
Test Plan:
Local test cases:
- buck test caffe2/benchmarks/static_runtime/fb:test_fb_operators
- buck test mode/opt caffe2/benchmarks/static_runtime:static_runtime_cpptest
- buck test mode/opt caffe2/test:static_runtime
Async execution of the subgraph was tested by adding PyTorch profiler hooks around the Static Runtime execution, using the code below. Execution in the thread pool was verified by inspecting the resulting trace.
from torch.profiler import profile, ProfilerActivity
with profile(activities=[ProfilerActivity.CPU]) as prof:
    static_runtime_module(inputs)
prof.export_chrome_trace("trace.json")
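
One way to inspect the exported trace programmatically is sketched below. It assumes the file follows the Chrome trace event layout, with a top-level `traceEvents` list whose complete events (`"ph": "X"`) carry `name` and `tid` fields. The helper `threads_by_event` is hypothetical, and the event names that Static Runtime emits are not specified here:

```python
import json
from collections import defaultdict


def threads_by_event(trace: dict) -> dict:
    """Map each event name to the set of thread ids it ran on."""
    seen = defaultdict(set)
    for event in trace.get("traceEvents", []):
        # Only complete events ("ph" == "X") are considered here.
        if event.get("ph") == "X" and "tid" in event:
            seen[event["name"]].add(event["tid"])
    return dict(seen)


# Usage, assuming trace.json was written by the export call above:
# with open("trace.json") as f:
#     print(threads_by_event(json.load(f)))
```

Events that appear on thread ids other than the caller's suggest that the forked subgraphs ran on pool threads.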
Differential Revision: D37044513
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79211
Approved by: https://github.com/mikeiovine