Add matmul optimization for the case A.ndim <= 2 && B.ndim >= 3 (#20448)
Summary:
This addresses #18862.
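
For illustration only, and not taken from this PR: one plausible way to speed up this case is the identity A @ B_i = (B_i^T @ A^T)^T, which lets the batch dimensions of B be folded into the rows of a single 2-D product instead of running one product per batch entry. The sketch below assumes that approach, uses plain nested lists rather than tensors, and all helper names are hypothetical.

```python
# Hypothetical sketch; helper names are illustrative, not PyTorch APIs.

def matmul2d(x, y):
    """Plain 2-D matrix product of nested lists: (n, k) @ (k, p) -> (n, p)."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def transpose(x):
    """Transpose a 2-D nested list."""
    return [list(col) for col in zip(*x)]


def batched_reference(a, b):
    """Baseline: broadcast 2-D `a` over each matrix in the 3-D batch `b`."""
    return [matmul2d(a, b_i) for b_i in b]


def batched_folded(a, b):
    """Same result with one 2-D product, using A @ B_i = (B_i^T @ A^T)^T."""
    p = len(b[0][0])  # number of columns in each B_i
    # Stack every B_i^T on top of the previous one: shape (batch * p, k).
    folded = [row for b_i in b for row in transpose(b_i)]
    # One large product: (batch * p, k) @ (k, n) -> (batch * p, n).
    prod = matmul2d(folded, transpose(a))
    # Split into per-batch (p, n) blocks and transpose each back to (n, p).
    return [transpose(prod[i:i + p]) for i in range(0, len(prod), p)]


a = [[1, 2], [3, 4], [5, 6]]                            # shape (3, 2)
b = [[[1, 0, 2], [0, 1, 3]], [[2, 1, 0], [1, 1, 1]]]    # shape (2, 2, 3)
assert batched_folded(a, b) == batched_reference(a, b)
```

The folded version trades many small products for one large one, which is typically where a gain would come from on real tensor backends; whether that matches the implementation in this change should be checked against the diff.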
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20448
Differential Revision: D15393465
Pulled By: ezyang
fbshipit-source-id: 87e5b0ed8253ea00365f420d98ac96dd4e934028