Backout D70075331
Summary:
X-link: https://github.com/pytorch/pytorch/pull/148824
AOTI lowering for model 699109736 and other new models worked before D70075331 but fails after it with the error "RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling cublasLtMatmul with transpose_mat1 1 transpose_mat2 0 m 4096 n 10 k 7936 mat1_ld 7936 mat2_ld 7936 result_ld 4096 abcType 2 computeType 68 scaleType 0".
Revert D70075331 as a workaround for now.
Reviewed By: chenyang78, adelesun
Differential Revision: D70823254
fbshipit-source-id: f3025a7543b7b2299457f5a06091a6fbeb37dc0d