Migrate addmm, addbmm and THBlas_gemm to ATen (#40927)
Summary:
Resubmit #40927
Closes https://github.com/pytorch/pytorch/issues/24679, closes https://github.com/pytorch/pytorch/issues/24678
`addbmm` depends on `addmm`, so the two needed to be ported at the same time. I also removed `THTensor_(baddbmm)`, which had already been ported and was just dead code.
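For context, the dependency follows from the math: `addbmm` computes `beta * input + alpha * sum_b(batch1[b] @ batch2[b])`, which can be expressed as a chain of `addmm` calls. Here is a minimal pure-Python sketch of that relationship, assuming nested lists as matrices. The helper names and signatures are illustrative only and do not mirror the ATen code in this PR.

```python
# Illustrative only: nested lists stand in for 2-D and 3-D tensors.

def addmm(inp, m1, m2, beta=1.0, alpha=1.0):
    """Return beta * inp + alpha * (m1 @ m2) for 2-D nested lists."""
    rows, inner, cols = len(m1), len(m2), len(m2[0])
    return [
        [
            beta * inp[i][j]
            + alpha * sum(m1[i][t] * m2[t][j] for t in range(inner))
            for j in range(cols)
        ]
        for i in range(rows)
    ]


def addbmm(inp, batch1, batch2, beta=1.0, alpha=1.0):
    """Return beta * inp + alpha * sum_b(batch1[b] @ batch2[b])."""
    # Scale the input once, then accumulate each batch product with beta=1
    # so the running result is not rescaled on every step.
    out = [[beta * x for x in row] for row in inp]
    for m1, m2 in zip(batch1, batch2):
        out = addmm(out, m1, m2, beta=1.0, alpha=alpha)
    return out
```

Each step in the loop is a single GEMM-style update, which is why the two functions share the same underlying routine.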
After writing this code, I had to fix merge conflicts with https://github.com/pytorch/pytorch/issues/40354, which revealed that ATen already has an established place for CPU BLAS routines. However, the version there doesn't use ATen's AVX dispatching, so I thought I'd wait for comments before migrating this code to that style.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40927
Reviewed By: ezyang
Differential Revision: D22468490
Pulled By: ngimel
fbshipit-source-id: f8a22be3216f67629420939455e31a88af20201d