pytorch
d8c368bd - CPU-strided-complex support for compare and pointwise ops (#28735)

Commit
5 years ago
CPU-strided-complex support for compare and pointwise ops (#28735)

Summary:
In-tree changes to pytorch to support complex numbers are being submitted here.
Out-of-tree support for complex numbers is here: [pytorch-cpu-strided-complex extension](https://gitlab.com/pytorch-complex/pytorch-cpu-strided-complex)

These changes optimize the complex Vec256 math kernels so that they are within 2X of real-number performance on average. [Benchmarks are here](https://docs.google.com/spreadsheets/d/17pObcrSTpV4BOOX9FYf1vIX3QUlEgQhLvL1IBEyJyzs/edit#gid=0)

Changes so far:

- [x] Added complex support for eq, neq, max, and min ops.
  - max/min ops need to compare the absolute value for complex numbers (using zabs).
- [x] Added complex support for is_nonzero and where.
  - where op compares the absolute value for complex numbers (using zabs).
- [x] Added complex support for linear interp and pointwise ops.
- [x] Added complex support for check_convert and Linspace/Logspace.
  - std::complex does not support the `++` operator.
  - All compilers tested (clang, g++, c++ on aarch64 and x86) produce the same assembly code when using `+=1` instead of `++`. [example for loop](https://godbolt.org/z/O6NW_p)
- [x] Added complex support for log, log2, log10.
- [x] Optimized Vec256 operators using various logarithmic identities.
  - `asin()`, `acos()`, and `atan()` are optimized using a `ln()` identity.
  - `sqrt()` is optimized by splitting the computation into real and imag parts.
  - several `_mm256_mul_pd` ops are avoided by using `_mm256_xor_pd` ops instead.
- [x] Added complex support for pow.
  - the exponent (`exp`) is cast to `std::complex<double>`.
  - no special optimization is added when `exp` is real, because the `std::pow()` operator expects a std::complex number.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/28735

Differential Revision: D18170691

Pulled By: ezyang

fbshipit-source-id: 6f167398e112cdeab02fcfde8b543cb6629c865a
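A few of the points in the change list can be illustrated with a scalar C++ sketch. This is not the code from the PR: the helper names (`max_by_abs`, `linspace_sketch`, `asin_via_log`, `atan_via_log`) are hypothetical, it uses plain `std::complex<double>` rather than Vec256, and the specific logarithmic identities shown are the textbook ones, assumed here to be the kind the commit message refers to.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cplx = std::complex<double>;

// Hypothetical helper: std::complex defines no operator<, so an ordering
// for max/min has to come from somewhere else. Here it is the magnitude.
inline cplx max_by_abs(const cplx& a, const cplx& b) {
  return std::abs(a) < std::abs(b) ? b : a;
}

// Hypothetical helper: a linspace-style fill whose running index is itself
// complex. std::complex has no operator++, so the index advances with += 1.0.
inline std::vector<cplx> linspace_sketch(cplx start, cplx end, std::size_t steps) {
  std::vector<cplx> out;
  out.reserve(steps);
  if (steps == 0) return out;
  if (steps == 1) {
    out.push_back(start);
    return out;
  }
  const cplx step = (end - start) / static_cast<double>(steps - 1);
  cplx i(0.0, 0.0);
  for (std::size_t n = 0; n < steps; ++n) {
    out.push_back(start + step * i);
    i += 1.0;  // `++i` would not compile for std::complex
  }
  return out;
}

// Textbook identity: asin(z) = -i * ln(i*z + sqrt(1 - z^2)).
inline cplx asin_via_log(const cplx& z) {
  const cplx I(0.0, 1.0);
  return -I * std::log(I * z + std::sqrt(cplx(1.0, 0.0) - z * z));
}

// Textbook identity: atan(z) = (i/2) * ln((i + z) / (i - z)).
inline cplx atan_via_log(const cplx& z) {
  const cplx I(0.0, 1.0);
  return (I / 2.0) * std::log((I + z) / (I - z));
}
```

For inputs away from the branch cuts, `asin_via_log` and `atan_via_log` should agree with `std::asin` and `std::atan` to within rounding error. The appeal of such identities in a vectorized kernel is that one well-optimized `log` and `sqrt` can serve several inverse trigonometric functions.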