Run performance tests non-alternately (#131935)
Summary:
By default, performance tests (speedup experiments) run the baseline and the test backend alternately.
However, this does not work for the torchao backend, which quantizes the model in place: once the model has been quantized, the "baseline" iterations effectively run with the torchao backend as well.
Add a new experiment, "latency_experiment", that runs performance tests non-alternately: first run the baseline for a few iterations, then run the test backend.
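A minimal sketch of the non-alternating flow (the helper names `measure_latency`, `latency_experiment`, and `apply_backend` are illustrative, not the benchmark suite's actual API):

```python
import time
import torch

def measure_latency(fn, inputs, iters=10):
    # Time a few iterations and return the median latency in seconds.
    times = []
    for _ in range(iters):
        torch.cuda.synchronize()
        start = time.perf_counter()
        fn(*inputs)
        torch.cuda.synchronize()
        times.append(time.perf_counter() - start)
    return sorted(times)[len(times) // 2]

def latency_experiment(model, example_inputs, apply_backend, iters=10):
    # 1) Finish all baseline iterations on the unmodified model first.
    baseline = measure_latency(model, example_inputs, iters)
    # 2) Only then apply the test backend; in-place changes (e.g. torchao
    #    quantization) can no longer contaminate the baseline numbers.
    test_model = apply_backend(model)
    test = measure_latency(test_model, example_inputs, iters)
    return baseline / test  # speedup
```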
Other changes:
Add torch.compiler.cudagraph_mark_step_begin() to avoid the slowdown from "Unable to hit fast path of CUDAGraphs because of pending, uninvoked backwards" (see the sketch below).
Update the torchao API calls to their current versions.
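A rough sketch of where the call fits in a benchmark loop (the model and loop here are illustrative, not the actual harness code):

```python
import torch

model = torch.compile(torch.nn.Linear(1024, 1024).cuda(), mode="reduce-overhead")
x = torch.randn(64, 1024, device="cuda")

for _ in range(100):
    # Mark the start of a new iteration so CUDAGraphs does not wait on a
    # backward pass that the benchmark never invokes; without this, each
    # call falls off the CUDAGraphs fast path and slows down.
    torch.compiler.cudagraph_mark_step_begin()
    out = model(x)
```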
X-link: https://github.com/pytorch/benchmark/pull/2394
Originally Reviewed By: xuzhao9
X-link: https://github.com/pytorch/pytorch/pull/131935
Approved by: https://github.com/xuzhao9
Reviewed By: xuzhao9, PaliC
Differential Revision: D60252821
Pulled By: HDCharles
fbshipit-source-id: 08ad452c5fcb34182c9aa7da1fe761db9587de71