[Inductor] Fix OpenMP discovery on MacOS (#93895)
OpenMP is not available as a system dependency on MacOS, so assume that it
is installed using Anaconda.
Also, clang on MacOS does not recognize the `-fopenmp` flag, but according
to https://mac.r-project.org/openmp/ and local experiments, `-Xclang
-fopenmp` always works.
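A minimal sketch of the flag selection described above, not the actual Inductor code. The helper name `openmp_flags`, the use of the `CONDA_PREFIX` environment variable to locate the Anaconda install, and the `-lomp` link flag are assumptions made for illustration:

```python
import os
import sys
from typing import List, Optional


def openmp_flags(platform: str = sys.platform,
                 conda_prefix: Optional[str] = None) -> List[str]:
    """Hypothetical helper: pick OpenMP compiler flags for the given platform."""
    if platform != "darwin":
        # Compilers on other platforms are assumed to accept the plain flag.
        return ["-fopenmp"]

    # Apple clang rejects `-fopenmp`, so pass it through with `-Xclang`.
    flags = ["-Xclang", "-fopenmp"]

    # Assumption: OpenMP headers and libraries live under the active conda env.
    if conda_prefix is None:
        conda_prefix = os.environ.get("CONDA_PREFIX")
    if conda_prefix:
        flags += [
            f"-I{conda_prefix}/include",  # where omp.h is expected
            f"-L{conda_prefix}/lib",      # where the OpenMP runtime is expected
            "-lomp",
        ]
    return flags
```

Calling `openmp_flags("darwin", "/opt/conda")` yields the `-Xclang -fopenmp` pair followed by the include and library paths under that prefix. With no prefix available, only the `-Xclang -fopenmp` pair is returned.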
Test plan:
The following should run and print `True`:
```python
import torch

def foo(x: torch.Tensor) -> torch.Tensor:
    return torch.sin(x) + torch.cos(x)

if __name__ == "__main__":
    x = torch.rand(3, 3)
    x_eager = foo(x)
    x_pt2 = torch.compile(foo)(x)
    print(torch.allclose(x_eager, x_pt2))
```
Skip a number of tests that fail on x86 MacOS (for example, rsqrt for bool type, and `test_pixel_shuffle_channels_last_cpu` on machines that do not support AVX2).
Tweak a few tests to use double precision when running on CPU, as type promotion for accumulator types is broken.
TODO: Fix PyTorch for M1 compilation with OpenMP, bundle `omp.h` into the package and use it instead.
Fixes https://github.com/pytorch/pytorch/issues/90362
Pull Request resolved: https://github.com/pytorch/pytorch/pull/93895
Approved by: https://github.com/jansel, https://github.com/jgong5