Fix std_mean f16 opinfo test by using reference_in_float (#109081)
The compiled f16 op appears to be more accurate than the eager f16 op, so comparing the two f16 results directly fails the tolerance check, while comparing each against a float64 reference (via `reference_in_float`) does not:
**Compiled float16 vs Eager float64**
```
Mismatched elements: 25 / 25 (100.0%)
Greatest absolute difference: 3.718038455710615e-05 at index (1, 0) (up to 1e-07 allowed)
Greatest relative difference: 0.0018021699903143316 at index (0, 4) (up to 1e-07 allowed)
```
**Eager float16 vs Eager float64**
```
Mismatched elements: 25 / 25 (100.0%)
Greatest absolute difference: 7.280254198286512e-05 at index (3, 3) (up to 1e-07 allowed)
Greatest relative difference: 0.004104326045245938 at index (0, 4) (up to 1e-07 allowed)
```
**Compiled float16 vs Eager float16**
```
Mismatched elements: 7 / 25 (28.0%)
Greatest absolute difference: 7.62939453125e-05 at index (3, 3) (up to 1e-05 allowed)
Greatest relative difference: 0.00588226318359375 at index (0, 4) (up to 0.001 allowed)
```
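The pattern above can be reproduced outside PyTorch. The following is a hypothetical NumPy sketch (not the code touched by this PR) showing why a reduction that accumulates in float16 drifts much further from the float64 ground truth than one that accumulates in a wider type, so two "float16 means" can legitimately disagree by more than the default f16 tolerances:

```python
import numpy as np

# Hypothetical demonstration: average 10,000 copies of 0.1 three ways and
# compare each mean against the float64 ground truth.
x = np.full(10000, 0.1, dtype=np.float16)

# Naive float16 accumulation, rounding after every add (like a kernel that
# keeps everything in half precision). Once the running sum reaches 256,
# float16 spacing is 0.25, so adding ~0.1 rounds away and the sum stalls.
acc = np.float16(0.0)
for v in x:
    acc = np.float16(acc + v)
mean_f16 = float(acc) / x.size

# Float32 accumulation (the kind of internal upcast a compiler may perform),
# rounded back to float16 only at the end.
mean_f32 = float(np.float16(x.astype(np.float32).mean()))

# Float64 ground truth over the same float16 inputs.
mean_f64 = float(x.astype(np.float64).mean())

err_f16 = abs(mean_f16 - mean_f64)
err_f32 = abs(mean_f32 - mean_f64)
# err_f16 is orders of magnitude larger than err_f32: both results are
# float16 means of the same data, yet only the f16-accumulated one sits far
# from the f64 reference.
```

This is why comparing against a float64 reference is the stable check: it does not penalize the implementation that happens to be more accurate.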
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109081
Approved by: https://github.com/eellison