Wconstab/score (#131)
* Add generate_score_config.py and compute_score.py
- Add baseline score config (score.yml) and check in
output torchbench_0.0.yaml
* Have CI upload benchmark score to scribe
* Remove maskrcnn from score temporarily
* Add --benchmark_data_dir option to compute_score.py
* Update run_bench_upload to use new compute_score flag
* Fix benchmark filename used in score computation
* Move dependencies from CI script to requirements.txt and add tabulate
* Change score computation to exp of the weighted sum of logs
This computation is more intuitive: a 2x improvement
across the board yields a 2x score improvement.
Also add a 'hack_data' option for quick experiments: start
from real data and scale only the measurements whose names
match a keyword by some factor.
* Add assert for missing data during score computation
* Temporarily remove cyclegan/stargan from score; they are disabled on PR jobs, which prevents score calculation on PRs
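
For reference, a minimal sketch of the "exp of the weighted sum of logs" score and the 'hack_data' idea described above. This is not the code in compute_score.py: the function names, argument shapes, the assumption that weights sum to 1, and the use of baseline/measured time ratios are all hypothetical choices made for illustration.

```python
import math


def compute_score(measured, baseline, weights):
    """Hypothetical score: exp(sum_i w_i * log(baseline_i / measured_i)).

    Assumes lower measurements are better (e.g. seconds) and that the
    weights sum to 1, so a uniform 2x speedup gives a 2x score.
    """
    log_sum = 0.0
    for name, weight in weights.items():
        # Fail loudly on missing data instead of silently skipping it.
        assert name in measured, f"missing measurement for {name}"
        assert name in baseline, f"missing baseline for {name}"
        log_sum += weight * math.log(baseline[name] / measured[name])
    return math.exp(log_sum)


def hack_data(measured, keyword, factor):
    """Return a copy with every measurement whose name contains
    `keyword` multiplied by `factor`; other entries are unchanged."""
    return {
        name: value * factor if keyword in name else value
        for name, value in measured.items()
    }


if __name__ == "__main__":
    weights = {"model_a": 0.5, "model_b": 0.5}
    baseline = {"model_a": 1.0, "model_b": 2.0}
    # Halve every measurement: a 2x across-the-board improvement.
    faster = hack_data(baseline, "model", 0.5)
    print(compute_score(faster, baseline, weights))  # approximately 2.0
```

With weights that sum to 1 this is a weighted geometric mean of the per-benchmark speedups, which is why a uniform speedup factor carries through to the score unchanged.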