Fix SpatialBatchNorm on NNPI (#36987)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36987
The discrepancy came from using Eigen's sqrt.
Replacing it with std::sqrt worked, so this now uses MKL's version.
Also removed momentum, made epsilon a float, and enhanced the test with hypothesis.
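
For reference, a minimal sketch of the inference-time batch norm computation, showing where the sqrt enters. It assumes the standard formula y = scale * (x - mean) / sqrt(var + epsilon) + bias. The function name and the list-of-channels layout are hypothetical and are not taken from the NNPI or Caffe2 code.

```python
import math


def spatial_bn_inference(x, mean, var, scale, bias, epsilon=1e-5):
    """Apply per-channel batch norm at inference time.

    x is a list of channels, each a flat list of floats (hypothetical layout).
    mean, var, scale and bias hold one float per channel.
    """
    out = []
    for c, channel in enumerate(x):
        # The sqrt below is the step where different implementations
        # (Eigen, std::sqrt, MKL) can round differently.
        inv_std = 1.0 / math.sqrt(var[c] + epsilon)
        # Fold the statistics into one multiply-add per element.
        alpha = scale[c] * inv_std
        beta = bias[c] - mean[c] * alpha
        out.append([alpha * v + beta for v in channel])
    return out
```

Because inv_std multiplies every element of a channel, a small rounding difference in the sqrt shows up across the whole output, which may be why the choice of sqrt implementation was visible in the test.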
Test Plan: Testing the MKL dependencies in prod. If things work, will remove the intrinsics implementation; if not, will use intrinsics.
Reviewed By: yinghai
Differential Revision: D21151661
fbshipit-source-id: 56e617b13bc32b0020691f7201d16dee00f651b5