Add maskrcnn_benchmark
* Install CUDA in the CI environment
* Add mini COCO dataset
* Copy maskrcnn code from git@github.com:wconstab/maskrcnn-benchmark.git
* Implement hubconf.py train() function for CUDA without JIT only
* JIT has not been tried yet
* Change maskrcnn imports to relative
- Move maskrcnn into the torchbenchmark subdir
- Only some of the imports are updated so far; more remain
* Disable apex/amp to avoid a CUDA memory leak