pytorch
aa4f27c1 - Prefer accurate reciprocal on ARMv8 (#59361)

Commit
3 years ago
Summary:
The default NEON-accelerated implementation of reciprocal uses vrecpeq_f32, which yields a Newton-Raphson approximation rather than the actual value.

Use regular NEON-accelerated division for the reciprocal and reciprocal square root operations instead.

This fixes `test_reference_numerics_hard_frac_cpu_float32`, `test_reference_numerics_normal_rsqrt_cpu_float32`, etc.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/59361

Reviewed By: mruberry

Differential Revision: D28870456

Pulled By: malfet

fbshipit-source-id: e634b0887cce7efb046ea1fd9b74424e0eceb164
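To illustrate why an estimate-based reciprocal can fail reference-numerics tests, here is a minimal portable sketch in scalar C++. It is not the code from this change and does not use NEON intrinsics: `coarse_reciprocal_estimate` is a hypothetical stand-in for a low-precision hardware estimate (it keeps only 8 mantissa bits, an assumption chosen for illustration), and every function name below is invented for this example.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a hardware reciprocal estimate.
// Assumption: we model "low precision" by keeping only the top 8 mantissa bits of 1/x.
inline float coarse_reciprocal_estimate(float x) {
    float r = 1.0f / x;
    std::uint32_t bits;
    std::memcpy(&bits, &r, sizeof bits);
    bits &= 0xFFFF8000u;  // clear the low 15 of the 23 mantissa bits
    std::memcpy(&r, &bits, sizeof r);
    return r;
}

// One Newton-Raphson step for the root of f(r) = 1/r - x:
// r_next = r * (2 - x * r). Each step roughly squares the relative error.
inline float newton_raphson_step(float x, float r) {
    return r * (2.0f - x * r);
}

// Estimate-based reciprocal: coarse estimate followed by `steps` refinements.
inline float approx_reciprocal(float x, int steps) {
    float r = coarse_reciprocal_estimate(x);
    for (int i = 0; i < steps; ++i) {
        r = newton_raphson_step(x, r);
    }
    return r;
}

// Division-based reciprocal: correctly rounded under IEEE 754 arithmetic.
inline float exact_reciprocal(float x) {
    return 1.0f / x;
}

// Relative error of `approx` with respect to `exact`.
inline float relative_error(float approx, float exact) {
    return std::fabs(approx - exact) / std::fabs(exact);
}
```

Comparing `approx_reciprocal(3.0f, 0)` and `approx_reciprocal(3.0f, 1)` against `exact_reciprocal(3.0f)` via `relative_error` shows the gap shrinking with each refinement step. The division-based path needs no refinement at all, which is the trade-off this change makes in favor of accuracy.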