Fix complex acos edge cases (#52287)
Summary:
Use `std::acos` even when AVX2 is available
Add a slow but accurate implementation of complex arc cosine based on
W. Kahan's "Branch Cuts for Complex Elementary Functions" paper, where
cacos(z).re = 2*atan2(sqrt(1-z).re(), sqrt(1+z).re())
cacos(z).im = asinh((sqrt(conj(1+z))*sqrt(1-z)).im())
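
For illustration only, the two formulas above can be sketched in standalone C++ roughly as follows. This is not the code added by this PR: the name `kahan_cacos` is hypothetical, it uses `std::complex` rather than `c10::complex`, and it assumes finite inputs with no special handling of overflow, underflow, or NaN.

```cpp
#include <cmath>
#include <complex>

// Hypothetical helper (not the PR's implementation): complex arc cosine
// computed from the two formulas quoted above. Assumes finite inputs.
template <typename T>
std::complex<T> kahan_cacos(const std::complex<T>& z) {
  // Form 1-z and 1+z component-wise so signed zeros are preserved.
  const std::complex<T> one_minus_z(T(1) - z.real(), -z.imag());
  const std::complex<T> one_plus_z(T(1) + z.real(), z.imag());

  const std::complex<T> a = std::sqrt(one_minus_z);            // sqrt(1-z)
  const std::complex<T> b = std::sqrt(one_plus_z);             // sqrt(1+z)
  const std::complex<T> c = std::sqrt(std::conj(one_plus_z));  // sqrt(conj(1+z))

  // cacos(z).re = 2*atan2(sqrt(1-z).re(), sqrt(1+z).re())
  const T re = T(2) * std::atan2(a.real(), b.real());

  // cacos(z).im = asinh((sqrt(conj(1+z))*sqrt(1-z)).im()),
  // with the imaginary part of the product written out explicitly.
  const T im = std::asinh(c.real() * a.imag() + c.imag() * a.real());

  return std::complex<T>(re, im);
}
```

For ordinary finite inputs such as `z = 0.5`, `z = 2`, or `z = i`, this sketch should agree with `std::acos` to within rounding.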
Fixes https://github.com/pytorch/pytorch/issues/42952
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52287
Reviewed By: walterddr
Differential Revision: D26455027
Pulled By: malfet
fbshipit-source-id: a81ce1ba4953eff4d3c2a265ef9199896a67b240