[inductor] fix the layout problem for nll_loss2d_backward (#121173)
Fixes https://github.com/pytorch/pytorch/issues/120759.
The CUDA implementation of nll_loss2d_backward.default requires the 'self' tensor to be contiguous. This implicit assumption may be broken by layout optimization. The fix here is to add the constraint when we explicitly define the fallback for the op.
Not sure if we can improve the CUDA kernel to relax this constraint, though.
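A minimal sketch of the shape of the change (the registration would live in torch/_inductor/lowering.py; the helper names make_fallback and require_contiguous are inductor internals and are assumed here, not confirmed from this PR's diff):

```python
# Sketch, not the actual diff: attach a contiguity layout constraint to the
# fallback registration so that inductor's layout optimization cannot hand the
# eager CUDA kernel a non-contiguous 'self' tensor.
import torch
from torch._inductor.lowering import make_fallback, require_contiguous

aten = torch.ops.aten

# In a real PyTorch build this op is already registered; the line only
# illustrates where the layout constraint attaches.
make_fallback(aten.nll_loss2d_backward, layout_constraint=require_contiguous)
```

With the constraint in place, inductor realizes the fallback's inputs as contiguous before calling the kernel, instead of passing through whatever layout (e.g. channels_last) the optimization picked.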
Pull Request resolved: https://github.com/pytorch/pytorch/pull/121173
Approved by: https://github.com/jansel, https://github.com/desertfire