Run tests and benchmarks in subprocesses for isolation (#423) (#426)
* Proof of concept for SubprocessWorker integration
* clean up subprocess benchmark code.
* move benchmarking components to pytorch/benchmark repo
* clean up, and fix pipes on Windows (hopefully)
* small Windows fixes
* more tweaks to Windows pipes
* optimize worker run from 700 us to 70 us (and no-op TorchBench model from 2 ms to 90 us)
* clean and document SubprocessWorker, and add benchmark for worker overhead
* move test.py to use subprocess tests
* fix typo
* Add timeout loop, and misc other hardening
* clean up integration between SubprocessWorker and TorchBench
* Add timeout to unit tests (but not benchmark runs)
* fix signature
* actually fix tests
* factor function parsing into dedicated function and add unit tests
* tweak import message
* enable TestParseFunction
* Make timeout kill worker, and expand tests
* delete model before checking for memory leak
* run component tests in CI, and harden path handling
* fix typo in test.py, and make constructor failure more robust
* install expecttest
* make benchmark test verbose
* debug: run only drq tests
* add verbose to debug CI failure
* garbage collect before and after tests
* remove debugging hack
* Force quantization to release Tensors
* fix strong_gc_collect
* fix YoloV3 CUDA train device string parsing
* remove quantization gc workaround, and do a bit of gc factoring
* add some more detail to the comments
* fix gc_collect
* actually fix gc_collect
* Only allow a single ModelTask to exist at a time
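
Illustrative sketch only, not the SubprocessWorker code from this change: the general idea behind "run in a subprocess for isolation" and "make timeout kill worker" can be shown with the Python standard library alone. The name `run_isolated` and its signature are hypothetical, and the real worker presumably keeps a long-lived process and communicates over pipes rather than spawning one interpreter per call.

```python
import subprocess
import sys
from typing import Tuple


def run_isolated(source: str, timeout: float) -> Tuple[bool, str]:
    """Run `source` in a fresh interpreter (hypothetical helper).

    Returns (succeeded, combined stdout/stderr). If the child runs longer
    than `timeout` seconds it is killed and the call reports failure.
    """
    proc = subprocess.Popen(
        [sys.executable, "-c", source],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    try:
        out, _ = proc.communicate(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Kill the child so a hung test cannot hold on to memory or devices,
        # then drain the pipe so the process is fully reaped.
        proc.kill()
        out, _ = proc.communicate()
        return False, out
    return proc.returncode == 0, out
```

Because each call gets its own process, any state the snippet leaks (imported modules, allocated memory) disappears when the child exits, which is the property the isolation is after.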