[inductor] fix accuracy failure for a few models under freezing (#121054)
Fix https://github.com/pytorch/pytorch/issues/120545 . The reason why these models fail accuracy test with freezing is due to the conv-batchnorm fusion. Conv-batchnorm fusion causes relative big numerical churn.
For the failed TIMM models, raising the tolerance to `8 * 1e-2` can make the test pass.
For the failed TB models, the numerical difference is too large. Having a discussion with @eellison , we decided to skip them with freezing for now.
One the other hand, we probably should dig more why the conv-bn fusion cause such large numerical difference.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/121054
Approved by: https://github.com/eellison