[nnc] Do not fuse matmul/conv2d if inputs are discontiguous. (#59754)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59754
Also, when the inputs are contiguous, use their Placeholders
directly rather than generating contiguous Tensors from them.
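A minimal sketch of the resulting fusion gate, written in Python for illustration (should_fuse_external_call is a hypothetical helper, not the actual NNC code):

```python
import torch

def should_fuse_external_call(*inputs):
    # Hypothetical helper mirroring the heuristic described above: fuse the
    # external call only when every input is contiguous, so its Placeholder
    # buffer can be consumed as-is with no physical copy.
    return all(t.is_contiguous() for t in inputs)

a = torch.randn(128, 64)
b = torch.randn(64, 128)
print(should_fuse_external_call(a, b))          # True: both inputs contiguous
print(should_fuse_external_call(b.t(), a.t()))  # False: transposed views
```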
The rationale for this change is that aten::matmul and aten::conv2d
support transposed inputs; if NNC has to generate a physical transpose
just to perform an external call, performance will be strictly worse
than not fusing at all (sometimes dramatically so, as in the attached
benchmark).
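A rough illustration of that cost, using a standalone timing rather than the attached benchmark (sizes are arbitrary):

```python
import timeit
import torch

a = torch.randn(1024, 1024)
b = torch.randn(1024, 1024)

# aten::matmul consumes the transposed view directly.
native = lambda: torch.matmul(a.t(), b)
# Forcing contiguity first inserts a physical transpose (a full copy),
# which is the overhead the fuser would otherwise pay.
copied = lambda: torch.matmul(a.t().contiguous(), b)

print("native view:", timeit.timeit(native, number=100))
print("physical transpose:", timeit.timeit(copied, number=100))
```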
Test Plan: benchmark
Reviewed By: ZolotukhinM
Differential Revision: D29010209
fbshipit-source-id: da6d71b155c83e8d6e306089042b6b0af8f80900