pytorch commit cdf40875 - [benchmarks] Disabling gradscaler (#89741)

Disabling GradScaler because:
1) The benchmark setup runs only 2 iterations of fwd-bwd, so the scaler is not useful there.
2) The current setup shares one grad_scaler between the eager and dynamo models. This is bad because GradScaler is stateful and can adjust the scaling factor between the eager and the dynamo run, making the accuracy check harder.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/89741
Approved by: https://github.com/ngimel
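The statefulness concern is concrete: scaler.update() can grow or shrink the loss scale after every step, so a scaler shared across both runs hands the dynamo model whatever scale the eager model ended on. Below is a minimal sketch of the isolation being described, assuming a CUDA device and a PyTorch 2.x build; run_two_iters, the toy Linear model, and the use of torch.compile as the dynamo entry point are illustrative stand-ins, not the benchmark harness's actual code.

```python
import torch

def run_two_iters(model, scaler, device="cuda"):
    # Mirrors the benchmark setup: only 2 fwd-bwd iterations per model.
    opt = torch.optim.SGD(model.parameters(), lr=0.01)
    for _ in range(2):
        opt.zero_grad()
        with torch.autocast(device_type=device, dtype=torch.float16):
            loss = model(torch.randn(8, 16, device=device)).sum()
        scaler.scale(loss).backward()
        scaler.step(opt)
        scaler.update()  # mutates the scale factor: this is the shared state

eager_model = torch.nn.Linear(16, 16).cuda()
dynamo_model = torch.compile(torch.nn.Linear(16, 16).cuda())

# One scaler per run, so neither run sees the other's scale factor.
run_two_iters(eager_model, torch.cuda.amp.GradScaler())
run_two_iters(dynamo_model, torch.cuda.amp.GradScaler())
```

Disabling the scaler outright, as this commit does, removes the shared state entirely; per-run scaler instances like the ones above would be the alternative way to keep the eager and dynamo accuracy comparison clean.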