Better handing of Autograd+Fork errors. (#33885)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33885
Fixes: #32835
Fixes: #5834
Can not combine with CUDA's implementation as each of them requires individual `std::once_flag` as well as different `forked_autograd_child` functions. CUDA version relays to python module, autograd uses TORCH_CHECK to report error to python and cpp.
Test Plan: Imported from OSS
Differential Revision: D20144024
Pulled By: VitalyFedyunin
fbshipit-source-id: e7cf30568fff5110e9df7fe5b23f18ed992fa17f