443c7016 - Add newly_run and no_longer_run metrics to output yaml (#1509)

Add newly_run and no_longer_run metrics to output yaml (#1509)

Summary: This PR changes the contract for what needs to be implemented. Previously, users had to handle the case where the two metrics JSONs do not have the same set of keys. Now we record the mismatches under no_longer_run_in_treatment and newly_run_in_treatment and guarantee that the key sets match by the time the JSONs reach the user-defined run function. The output YAML would look like:

```
control_env:
  pytorch_git_version: 00891e96e8f2444785ae908c428514a726c27da8
treatment_env:
  pytorch_git_version: 00891e96e8f2444785ae908c428514a726c27da8
bisection: null
details:
  BERT_pytorch, Adadelta, cuda, (pt2) default:
    control: 0.009517530572008003
    treatment: 0.009517530572008003
    delta: 0.0
  BERT_pytorch, Adadelta, cuda, default:
    control: 0.008748639142140746
    treatment: 0.008748639142140746
    delta: 0.0
  BERT_pytorch, Adadelta, cuda, (pt2) maximize:
    control: 0.010465960879810155
    treatment: 0.010465960879810155
    delta: 0.0
  ...
no_longer_run_in_treatment:
  BERT_pytorch, Adadelta, cuda, (pt2) foreach, maximize: 0.010405212640762329
  BERT_pytorch, Adadelta, cuda, foreach, maximize: 0.009411881134534875
  BERT_pytorch, Adagrad, cuda, (pt2) foreach, maximize: 0.03404413016202549
  ...
newly_run_in_treatment:
  BERT_pytorch, Adadelta, cuda, (pt2) differentiable: 0.0033336214274944116
  BERT_pytorch, Adadelta, cuda, differentiable: 0.017110475042136385
  BERT_pytorch, Adagrad, cuda, (pt2) differentiable: 0.003775304475500477
  BERT_pytorch, Adagrad, cuda, differentiable: 0.007527894619852304
  BERT_pytorch, Adam, cuda, (pt2) amsgrad, maximize: 0.00928849776127291
  ...
```

A potential downside is that users may WANT to handle the mismatches themselves, and this change removes that knob. The alternative would be to write NaNs for the missing values and let users process the results from the YAML later. That would establish a different kind of contract: a NaN is recorded whenever the actual measurement is missing.
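As a minimal sketch of the reconciliation contract described above (the function name and return shape are illustrative, not the actual API in pytorch/benchmark):

```python
def reconcile_metrics(control: dict, treatment: dict):
    """Split two metric mappings into matched pairs and key mismatches.

    Hypothetical helper: records keys that appear only in one side,
    then pairs up the common keys so downstream code (the user-defined
    run function) can assume both sides have identical key sets.
    """
    control_keys = set(control)
    treatment_keys = set(treatment)

    # Benchmarks that ran in control but no longer run in treatment,
    # and vice versa; these go into the two new top-level YAML sections.
    no_longer_run = {k: control[k] for k in control_keys - treatment_keys}
    newly_run = {k: treatment[k] for k in treatment_keys - control_keys}

    # Only the common keys reach the user-defined run function.
    details = {
        k: {
            "control": control[k],
            "treatment": treatment[k],
            "delta": treatment[k] - control[k],
        }
        for k in control_keys & treatment_keys
    }
    return details, no_longer_run, newly_run
```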
I'm not sure which is better. With the NaN approach, the YAML would instead look like:

```
control_env:
  pytorch_git_version: 00891e96e8f2444785ae908c428514a726c27da8
treatment_env:
  pytorch_git_version: 00891e96e8f2444785ae908c428514a726c27da8
bisection: null
details:
  BERT_pytorch, Adadelta, cuda, (pt2) default:
    control: 0.009517530572008003
    treatment: 0.009517530572008003
    delta: 0.0
  BERT_pytorch, Adadelta, cuda, default:
    control: 0.008748639142140746
    treatment: 0.008748639142140746
    delta: 0.0
  BERT_pytorch, Adadelta, cuda, (pt2) maximize:
    control: 0.010465960879810155
    treatment: 0.010465960879810155
    delta: 0.0
  BERT_pytorch, Adadelta, cuda, (pt2) foreach, maximize:
    control: 0.010405212640762329
    treatment: NaN
    delta: NaN
  ...
  BERT_pytorch, Adadelta, cuda, (pt2) differentiable:
    control: NaN
    treatment: 0.0033336214274944116
    delta: NaN
  ...
```

Pull Request resolved: https://github.com/pytorch/benchmark/pull/1509
Reviewed By: xuzhao9
Differential Revision: D44518353
Pulled By: janeyx99
fbshipit-source-id: d701cf886a7126f0776644cc3ba6d7150441cc66
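The NaN alternative could be sketched as follows (again with illustrative names, not the repo's actual code): take the union of the key sets and fill any missing measurement with NaN, so every benchmark appears under details.

```python
import math

def merge_with_nans(control: dict, treatment: dict):
    """Union the two key sets, filling missing measurements with NaN.

    Under this contract the user sees every key from either side and
    must check for NaN to detect a benchmark that ran on only one side.
    """
    out = {}
    for k in set(control) | set(treatment):
        c = control.get(k, math.nan)
        t = treatment.get(k, math.nan)
        # NaN propagates through arithmetic, so a missing side
        # automatically yields delta: NaN.
        out[k] = {"control": c, "treatment": t, "delta": t - c}
    return out
```

This keeps the mismatch-handling knob in the user's hands, at the cost of pushing NaN checks into every consumer of the YAML.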