Use simd version for fp16 conversions (#31897)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31897
The previous version only used AVX2. The _simd version uses AVX-512 when the CPU supports it.
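The runtime selection described above can be sketched roughly as follows. This is a minimal illustration, not the code touched by this change: the names `Isa`, `detect_isa`, `half_to_float_scalar`, and `half_to_float_dispatch` are hypothetical, and the wide paths are stand-ins that reuse the scalar loop, since real AVX2/AVX-512 kernels need intrinsics and per-file compiler flags. It assumes GCC or Clang on x86 for the feature check and falls back to scalar code elsewhere.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical ISA levels for this sketch.
enum class Isa { Scalar, Avx2, Avx512 };

// Pick the widest ISA the running CPU reports. Assumes GCC/Clang on x86;
// every other target falls back to Scalar. A real AVX2 path would also
// have to confirm F16C support.
inline Isa detect_isa() {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
  if (__builtin_cpu_supports("avx512f")) return Isa::Avx512;
  if (__builtin_cpu_supports("avx2")) return Isa::Avx2;
#endif
  return Isa::Scalar;
}

// Portable reference: convert one IEEE 754 binary16 value to float.
inline float half_to_float_scalar(std::uint16_t h) {
  const std::uint32_t sign = static_cast<std::uint32_t>(h & 0x8000u) << 16;
  std::uint32_t exp = (h >> 10) & 0x1Fu;
  std::uint32_t mant = h & 0x3FFu;
  std::uint32_t bits;
  if (exp == 0) {
    if (mant == 0) {
      bits = sign;  // signed zero
    } else {
      // Subnormal half: shift the mantissa up until the implicit bit appears.
      std::uint32_t shift = 0;
      while ((mant & 0x400u) == 0) {
        mant <<= 1;
        ++shift;
      }
      mant &= 0x3FFu;
      exp = 127 - 15 - shift + 1;
      bits = sign | (exp << 23) | (mant << 13);
    }
  } else if (exp == 0x1Fu) {
    bits = sign | 0x7F800000u | (mant << 13);  // infinity or NaN
  } else {
    bits = sign | ((exp + 112) << 23) | (mant << 13);  // rebias 15 -> 127
  }
  float f;
  std::memcpy(&f, &bits, sizeof f);
  return f;
}

// Dispatching entry point. In this sketch all branches run the scalar loop;
// the Avx512 and Avx2 cases mark where wide kernels would be called.
inline void half_to_float_dispatch(const std::uint16_t* src, float* dst,
                                   std::size_t n) {
  assert((src != nullptr && dst != nullptr) || n == 0);
  static const Isa isa = detect_isa();  // detect once, reuse afterwards
  switch (isa) {
    case Isa::Avx512:  // placeholder for a 16-lane kernel
    case Isa::Avx2:    // placeholder for an 8-lane kernel
    case Isa::Scalar:
      for (std::size_t i = 0; i < n; ++i) {
        dst[i] = half_to_float_scalar(src[i]);
      }
      break;
  }
}
```

Callers only see `half_to_float_dispatch`, so they get the widest available path without changing their own code, and the scalar routine doubles as a reference for unit tests of the vectorized kernels.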
Test Plan: Unit tests
Reviewed By: tracelogfb
Differential Revision: D19291499
fbshipit-source-id: 3b1ee0ba756e5c9defbd5caf7f68982d9b2ca06c