Avoid copies in matmul (#76828)
With this PR, matmul folds a bmm into a mm or mv if and only if it
can do so without copying. We add tests to make sure that the
algorithm that detects this is accurate.
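
As a rough illustration of what "without copying" means here, the sketch below checks from sizes and strides alone whether the leading dimensions of a batched operand can be merged into one dimension as a view. This is the precondition for treating (b, n, k) @ (k, m) as a single (b*n, k) @ (k, m) mm. The helper names contiguous_strides and can_fold_leading_dims are hypothetical and this is not the code added by the PR; it is a pure-Python sketch of the general stride condition only.

```python
def contiguous_strides(shape):
    """Row-major strides (in elements) for a dense tensor of this shape."""
    strides = []
    step = 1
    for size in reversed(shape):
        strides.append(step)
        step *= size
    return tuple(reversed(strides))


def can_fold_leading_dims(shape, strides):
    """Return True if all dims except the last can be merged into a single
    dim as a view, i.e. without copying the data.

    Hypothetical helper for illustration, not PyTorch's implementation.
    """
    # An empty tensor has no elements, so any reshape is trivially a view.
    if 0 in shape:
        return True
    # Size-1 dims impose no constraint, so drop them before comparing.
    dims = [(size, stride)
            for size, stride in zip(shape[:-1], strides[:-1])
            if size != 1]
    # Adjacent dims merge only if stepping the outer index once equals
    # stepping the inner index across its full extent.
    for (_, outer_stride), (inner_size, inner_stride) in zip(dims, dims[1:]):
        if outer_stride != inner_size * inner_stride:
            return False
    return True


# Dense (2, 3, 4) batch: can be viewed as (6, 4), so the fold needs no copy.
dense_ok = can_fold_leading_dims((2, 3, 4), contiguous_strides((2, 3, 4)))

# Last two dims transposed, shape (2, 4, 3) with strides (12, 1, 4):
# merging the first two dims would require a copy.
transposed_ok = can_fold_leading_dims((2, 4, 3), (12, 1, 4))

# Batch dim produced by expand (stride 0): folding would require a copy.
expanded_ok = can_fold_leading_dims((2, 3, 4), (0, 4, 1))
```

Under this check, only the dense case qualifies for folding; the transposed and expanded cases would have to stay on the bmm path to avoid materializing a copy.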
For the cases where it was copying before, see:
https://github.com/pytorch/pytorch/pull/75197#discussion_r843413208
https://github.com/pytorch/pytorch/pull/75197#discussion_r863489479
https://github.com/pytorch/pytorch/pull/75197#discussion_r863489805
Fixes https://github.com/pytorch/pytorch/issues/76702
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76828
Approved by: https://github.com/ngimel