[aot] fix disable amp for runtime wrapper (#97864)
For the current runtime wrapper in aot, `disable_amp` is always set to True. In fact, we would like to avoid disabling autocast if possible because accessing TLS is slow. In this PR, `disable_amp` depends on whether there is any autocast enabled instead of always being True. Many operators would get an improvement of performance (inductor v.s. eager) with this fix.
Example of operators' 0.8 speedup in torchbench (inductor v.s. eager):
<html xmlns:v="urn:schemas-microsoft-com:vml"
xmlns:o="urn:schemas-microsoft-com:office:office"
xmlns:x="urn:schemas-microsoft-com:office:excel"
xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta name=ProgId content=Excel.Sheet>
<meta name=Generator content="Microsoft Excel 15">
<link id=Main-File rel=Main-File
href="file:///C:/Users/xuanliao/AppData/Local/Temp/msohtmlclip1/01/clip.htm">
<link rel=File-List
href="file:///C:/Users/xuanliao/AppData/Local/Temp/msohtmlclip1/01/clip_filelist.xml">
</head>
<body link="#0563C1" vlink="#954F72">
| current | new
-- | -- | --
aten.hardsigmoid.default | 0.709372349 | 0.81414306
aten.tanh.default | 0.715227805 | 0.855556349
aten.add.Scalar | 0.682292123 | 0.860371222
aten.sigmoid_backward.default | 0.688039934 | 0.915606579
</body>
</html>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/97864
Approved by: https://github.com/EikanWang, https://github.com/jansel, https://github.com/jgong5, https://github.com/bdhirsh