[TensorExpr] Fix propagation of loop options when splitting loops (#40035)
Summary:
Fix a bug in SplitWithTail and SplitWithMask where loop_options, such as CUDA block/thread bindings, were overwritten by the split. The fix propagates the loop options to the outer loop, which should be equivalent for axis bindings.
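The intended behavior can be illustrated with a toy model. This is a minimal sketch, not the actual TensorExpr code: `LoopOptions`, `Loop`, `SplitResult`, and `split_with_mask` below are hypothetical, simplified stand-ins, and the real classes carry more state. It shows only the propagation rule: after a split, the outer loop inherits the original loop's options and the inner loop gets default options.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the loop options; -1 means "axis not bound".
struct LoopOptions {
  int gpu_block_index = -1;
  int gpu_thread_index = -1;
  bool is_default() const {
    return gpu_block_index == -1 && gpu_thread_index == -1;
  }
};

// Hypothetical stand-in for a loop over the range [start, stop).
struct Loop {
  std::string var;
  int start;
  int stop;
  LoopOptions options;
};

struct SplitResult {
  Loop outer;
  Loop inner;
};

// Toy masked split: the outer loop runs ceil(extent / factor) times and the
// inner loop runs `factor` times (a real implementation would also guard the
// body with a mask for out-of-range iterations).
SplitResult split_with_mask(const Loop& loop, int factor) {
  assert(factor > 0);
  const int extent = loop.stop - loop.start;
  const int outer_extent = (extent + factor - 1) / factor;

  // Propagate the original options (e.g. a block binding) to the outer loop
  // instead of dropping them.
  Loop outer{loop.var + "_outer", 0, outer_extent, loop.options};
  // The inner loop starts with default (unbound) options.
  Loop inner{loop.var + "_inner", 0, factor, LoopOptions{}};
  return SplitResult{outer, inner};
}
```

In this sketch, splitting a loop bound to block index 0 by a factor of 32 yields an outer loop that is still bound to block index 0, so a binding set before the split is not lost.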
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40035
Reviewed By: ZolotukhinM
Differential Revision: D22080263
Pulled By: nickgg
fbshipit-source-id: b8a9583fd90f69319fc4bb4db644e91f6ffa8e67