clean up engine.cpp thread state (#63115)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63115
Besides the cleanup, this changes behavior in two ways:
- callbacks now run with the correct grad mode, even when invoked from worker threads
- GraphTask's Future callbacks now run with the correct TLS (thread-local state) when an error is raised from a worker thread
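For illustration only, a minimal standalone sketch of the general pattern behind both points: capture the caller's thread-local state, then install it on the worker thread before running a callback. Every name here (grad_enabled, ThreadState, ThreadStateGuard, run_callback_in_worker) is hypothetical and is not taken from engine.cpp or from PyTorch's API.

```cpp
#include <thread>

// Hypothetical stand-in for a per-thread "grad mode" flag.
// Each new thread starts with the default value (true).
thread_local bool grad_enabled = true;

// Snapshot of the thread-local state a callback should observe.
struct ThreadState {
  bool grad_enabled;
};

// RAII guard: installs a captured state on the current thread and
// restores the previous state when the scope exits.
class ThreadStateGuard {
 public:
  explicit ThreadStateGuard(const ThreadState& s) : prev_{grad_enabled} {
    grad_enabled = s.grad_enabled;
  }
  ~ThreadStateGuard() { grad_enabled = prev_.grad_enabled; }
  ThreadStateGuard(const ThreadStateGuard&) = delete;
  ThreadStateGuard& operator=(const ThreadStateGuard&) = delete;

 private:
  ThreadState prev_;
};

// Runs a trivial "callback" on a fresh worker thread and returns the grad
// mode it observed. With propagate == true, the caller's state is installed
// on the worker first; otherwise the worker sees its own default.
bool run_callback_in_worker(bool propagate) {
  const ThreadState captured{grad_enabled};  // captured on the calling thread
  bool observed = false;
  std::thread worker([&] {
    if (propagate) {
      ThreadStateGuard guard(captured);
      observed = grad_enabled;
    } else {
      observed = grad_enabled;
    }
  });
  worker.join();
  return observed;
}
```

In this sketch, if the calling thread has set grad_enabled to false, the worker observes false only when the state is propagated; without propagation it falls back to the per-thread default.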
Test Plan: Imported from OSS
Reviewed By: ngimel
Differential Revision: D30388100
Pulled By: albanD
fbshipit-source-id: 7ae9c461c2f0040548dd9e1e314f25e8da0c2e67