pytorch
d305d4a5 - [Dynamo] Fix TIMM benchmark compute_loss (#97423)

Fixes #97382.

#95416 fixed a critical bug in the dynamo benchmark suite, where AMP tests fell back to eager mode before that PR. However, after that PR landed, we found [a list of TIMM models failing amp + eager + training testing](https://docs.google.com/spreadsheets/d/1DEhirVOkj15Lu4UNawIUon9MqkVLaWqyT-DQPif5NHk/edit#gid=0). We identified the root cause: high loss values make gradient checking harder, because small changes in accumulation order upset the accuracy checks. We should switch to the helper function `reduce_to_scalar_loss`, which is already used by the Torchbench tests.

After switching to `reduce_to_scalar_loss`, the TIMM models' accuracy pass rate grows from 67.74% to 91.94% in my local testing. The remaining 5 failing models (ese_vovnet19b_dw, fbnetc_100, mnasnet_100, mobilevit_s, sebotnet33ts_256) need further investigation, but they likely fail for a similar reason.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97423
Approved by: https://github.com/Chillee
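To illustrate the idea the fix relies on, here is a minimal, hypothetical sketch of a "reduce to scalar loss" helper. It is a plain-Python stand-in, not the actual benchmark utility: it collapses a model's (possibly nested) output into one small scalar mean, so that the compared loss values stay low and accuracy checks are less sensitive to floating-point accumulation order.

```python
# Hypothetical sketch (assumption: not the real pytorch/benchmarks helper).
# Collapses a scalar, or a nested list/tuple/dict of scalars, into a single
# mean value, mimicking how a benchmark might reduce a model's raw output
# to one scalar loss before gradient/accuracy comparison.

def reduce_to_scalar_loss(out):
    """Reduce `out` to a single scalar by recursively averaging."""
    if isinstance(out, (int, float)):
        # Base case: already a scalar.
        return float(out)
    if isinstance(out, (list, tuple)):
        # Average the reduced value of each element.
        return sum(reduce_to_scalar_loss(x) for x in out) / len(out)
    if isinstance(out, dict):
        # Average over all values in the dict.
        vals = list(out.values())
        return sum(reduce_to_scalar_loss(v) for v in vals) / len(vals)
    raise TypeError(f"unsupported output type: {type(out)!r}")


# Example: a nested "model output" reduced to one scalar.
print(reduce_to_scalar_loss({"logits": [2.0, 4.0], "aux": 2.0}))  # → 2.5
```

Averaging (rather than summing) keeps the scalar's magnitude bounded regardless of output size, which is exactly why a large raw loss, as described above, was tripping the accuracy checks.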