[TensorExpr] TensorExprKernel: don't do any compilation or lowering in run(). (#37948)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37948
The input JIT graph contains all the information we need to perform the
entire compilation at construction time, so no steps need to be
postponed until execution time. Also, the graph always tells us which
device we will be executing on, so TensorExprKernel does not need a
CodeGen cache: it always has exactly one CodeGen.
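The pattern described above can be sketched as follows. This is a minimal, self-contained illustration (not the actual PyTorch `TensorExprKernel` API): `Graph`, `Kernel`, and the closure-based "codegen" are hypothetical stand-ins showing compilation done eagerly in the constructor, with `run()` reduced to pure execution.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for a JIT graph: the device and inputs are
// fully known before execution, so nothing forces lazy compilation.
struct Graph {
    std::string device;
    std::vector<int> inputs;
};

class Kernel {
public:
    explicit Kernel(const Graph& g)
        : device_(g.device),
          // "Compile" once, eagerly, at construction: lowering is
          // simulated here by building a closure that sums the inputs.
          compiled_([g]() {
              int acc = 0;
              for (int v : g.inputs) acc += v;
              return acc;
          }) {}

    // run() performs no compilation and consults no cache; it simply
    // invokes the single pre-built "CodeGen" for the known device.
    int run() const { return compiled_(); }

private:
    std::string device_;
    std::function<int()> compiled_;
};
```

Because the device is fixed at construction, there is exactly one compiled artifact per kernel, and repeated calls to `run()` never trigger recompilation.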
Test Plan: Imported from OSS
Reviewed By: protonu
Differential Revision: D21432145
Pulled By: ZolotukhinM
fbshipit-source-id: 8dc86b891713056b2c62f30170cd4a168912f027
Author: Mikhail Zolotukhin