Fix NaN handling in torch.mv. (#31666)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31666
List of changes:
1) Fix a case where torch.mv did not handle NaNs correctly. In particular, with a transposed matrix and an expanded vector, NaNs already present in the output buffer were preserved even when beta = 0 (BLAS computes beta * y + alpha * A @ x, and 0 * NaN is NaN), when they should have been ignored.
This is handled in the `out=` case by zeroing out the passed-in tensor, but the same issue can occur with the non-out variant if the freshly allocated tensor happens to contain a NaN.
Also adds tests for this case.
NOTE: we zero out the output tensor in all cases for mv and mm, even though this is probably overkill. I didn't find another case where this would be a problem, but the old code at least attempted to do this for all mv and mm calls, and I haven't added comprehensive testing to be sure it isn't a problem.
2) on CPU: move mv, mv_out, mm, mm_out to be direct wrappers on _th_addmv, _th_addmm, rather than having their own wrappers in Declarations.cwrap.
This removes the magic around cpu_zero from the codegen, which simplifies the codegen and makes this easier to test.
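The intended post-fix semantics from change 1 can be sketched as follows. This is a hypothetical illustration of the failure mode, not a test from the PR; the shapes are chosen to match the transposed-matrix / expanded-vector case described above:

```python
import torch

A = torch.randn(3, 4)
x = torch.randn(1).expand(3)           # expanded vector (stride-0 dimension)

# Pre-fill the out buffer with NaNs; with the fix, mv must zero the buffer
# first so stale NaNs cannot leak into the result via beta * y.
out = torch.full((4,), float('nan'))
torch.mv(A.t(), x, out=out)            # A.t() is 4x3, x is length 3
assert not torch.isnan(out).any()

# Equivalent addmv formulation: beta = 0 means the input must be ignored
# entirely, including any NaNs it contains.
y = torch.full((4,), float('nan'))
res = torch.addmv(y, A.t(), x, beta=0)
assert not torch.isnan(res).any()
```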
Test Plan: Imported from OSS
Differential Revision: D19239953
Pulled By: gchanan
fbshipit-source-id: 27d0748d215ad46d17a8684696d88f4cfd8a917e