inductor: promote half/bfloat16 constant to float for cpu vectorization path (#105440)
As in the scalar path, half/bfloat16 constants should also be promoted to float in the CPU vectorization path for better accuracy. After this PR, the AMP path of the TIMM ```dm_nfnet``` model passes.
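
A rough illustration of why keeping the constant in float helps. This is a standalone sketch, not the inductor codegen: it emulates bfloat16 rounding in pure Python, and the helper names `to_float32` and `to_bfloat16` are hypothetical, introduced only for this example.

```python
import struct


def to_float32(x: float) -> float:
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def to_bfloat16(x: float) -> float:
    """Emulate bfloat16 by keeping the top 16 bits of the float32 encoding.

    Uses round-to-nearest-even on the discarded 16 bits. NaN/Inf handling is
    omitted because this sketch only feeds in small finite constants.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    bits = (((bits + rounding_bias) >> 16) << 16) & 0xFFFFFFFF
    return struct.unpack("<I".replace("I", "f"), struct.pack("<I", bits))[0]


constant = 0.1  # assumed example constant; not taken from dm_nfnet
x = 3.0         # assumed example input, exactly representable in all formats

const_bf16 = to_bfloat16(constant)  # constant rounded to bfloat16
const_fp32 = to_float32(constant)   # constant promoted to float

# Compare both products against the float64 reference.
err_bf16 = abs(x * const_bf16 - x * constant)
err_fp32 = abs(x * const_fp32 - x * constant)

print(f"bfloat16 constant: {const_bf16!r}, abs error: {err_bf16:.3e}")
print(f"float32 constant:  {const_fp32!r}, abs error: {err_fp32:.3e}")
```

In this sketch the bfloat16 constant carries only 8 bits of significand precision, so the product computed with it is several orders of magnitude less accurate than the one computed with the float constant.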
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105440
Approved by: https://github.com/jgong5, https://github.com/jansel