pytorch
95a1725a - Vsx initial support issue27678 (#41541)

Commit

3 years ago

Vsx initial support issue27678 (#41541)

Summary:

### Pytorch Vec256 ppc64le support

Implemented types:

- double
- float
- int16
- int32
- int64
- qint32
- qint8
- quint8
- complex_float
- complex_double

Notes:

All basic vector operations are implemented. There are a few problems:

- minimum/maximum NaN propagation for ppc64le is missing and was not checked
- complex multiplication, division, sqrt and abs are implemented the same way as on PyTorch x86. They can overflow and have worse precision than the std ones. That's why they were either excluded or tested in a smaller domain range
- precision of the implemented float math functions

~~Besides, I added CPU_CAPABILITY for power. But because of quantization errors for DEFAULT I had to undef and use vsx for DEFAULT too~~

#### Details

##### Supported math functions

Legend:

- `+` means vectorized
- `-` means missing
- implementation notes are added inside parentheses. Example: `-(both)` means it was also missing on the x86 side
- `f(func_name)` means the vectorization is using `func_name`
- `sleef` means redirected to Sleef
- `unsupported`

| function_name | float | double | complex float | complex double |
| -- | -- | -- | -- | -- |
| acos | sleef | sleef | f(asin) | f(asin) |
| asin | sleef | sleef | +(pytorch impl) | +(pytorch impl) |
| atan | sleef | sleef | f(log) | f(log) |
| atan2 | sleef | sleef | unsupported | unsupported |
| cos | +((ppc64le:avx_mathfun)) | sleef | -(both) | -(both) |
| cosh | f(exp) | -(both) | -(both) | |
| erf | sleef | sleef | unsupported | unsupported |
| erfc | sleef | sleef | unsupported | unsupported |
| erfinv | -(both) | -(both) | unsupported | unsupported |
| exp | + | sleef | -(x86:f()) | -(x86:f()) |
| expm1 | f(exp) | sleef | unsupported | unsupported |
| lgamma | sleef | sleef | | |
| log | + | sleef | -(both) | -(both) |
| log10 | f(log) | sleef | f(log) | f(log) |
| log1p | f(log) | sleef | unsupported | unsupported |
| log2 | f(log) | sleef | f(log) | f(log) |
| pow | + f(exp) | sleef | -(both) | -(both) |
| sin | +((ppc64le:avx_mathfun)) | sleef | -(both) | -(both) |
| sinh | f(exp) | sleef | -(both) | -(both) |
| tan | sleef | sleef | -(both) | -(both) |
| tanh | f(exp) | sleef | -(both) | -(both) |
| hypot | sleef | sleef | -(both) | -(both) |
| nextafter | sleef | sleef | -(both) | -(both) |
| fmod | sleef | sleef | -(both) | -(both) |

[Vec256 Test cases Pr https://github.com/pytorch/pytorch/issues/42685](https://github.com/pytorch/pytorch/pull/42685)

Current list:

- [x] Blends
- [x] Memory: UnAlignedLoadStore
- [x] Arithmetics: Plus, Minus, Multiplication, Division
- [x] Bitwise: BitAnd, BitOr, BitXor
- [x] Comparison: Equal, NotEqual, Greater, Less, GreaterEqual, LessEqual
- [x] MinMax: Minimum, Maximum, ClampMin, ClampMax, Clamp
- [x] SignManipulation: Absolute, Negate
- [x] Interleave: Interleave, DeInterleave
- [x] Rounding: Round, Ceil, Floor, Trunc
- [x] Mask: ZeroMask
- [x] SqrtAndReciprocal: Sqrt, RSqrt, Reciprocal
- [x] Trigonometric: Sin, Cos, Tan
- [x] Hyperbolic: Tanh, Sinh, Cosh
- [x] InverseTrigonometric: Asin, ACos, ATan, ATan2
- [x] Logarithm: Log, Log2, Log10, Log1p
- [x] Exponents: Exp, Expm1
- [x] ErrorFunctions: Erf, Erfc, Erfinv
- [x] Pow: Pow
- [x] LGamma: LGamma
- [x] Quantization: quantize, dequantize, requantize_from_int
- [x] Quantization: widening_subtract, relu, relu6

Missing:

- [ ] Constructors, initializations
- [ ] Conversion, Cast
- [ ] Additional: imag, conj, angle (note: imag and conj only checked for float complex)

#### Notes on tests and testing framework

- some math functions are tested only within a domain range
- for the most part, the testing framework randomly tests against the std implementation within the domain, or within the implementation domain for some math functions
- some functions are tested against the local version. ~~For example, std::round and vector version of round differs. so it was tested against the local version~~
- round was tested against pytorch at::native::round_impl. ~~for double type on **Vsx vec_round failed for (even)+0.5 values**~~. It was solved by using vec_rint
- ~~**complex types are not tested**~~ **After enabling complex testing, some of the complex functions failed for vsx and x86 avx as well, due to precision and domain. I will either test them against the local implementation or check within the accepted domain**
- ~~quantizations are not tested~~ Added tests for the quantize, dequantize, requantize_from_int, relu, relu6 and widening_subtract functions
- the testing framework should be improved further
- ~~For now `-DBUILD_MOBILE_TEST=ON `will be used for Vec256Test too~~ Vec256 Test cases will be built for each CPU_CAPABILITY

Pull Request resolved: https://github.com/pytorch/pytorch/pull/41541

Reviewed By: zhangguanheng66

Differential Revision: D23922049

Pulled By: VitalyFedyunin

fbshipit-source-id: bca25110afccecbb362cea57c705f3ce02f26098
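
The rounding note in the commit message (vec_round failing on "(even)+0.5" values, fixed with vec_rint) looks like a tie-breaking mismatch. The scalar sketch below illustrates the two rules using only the C++ standard library. It assumes that the reference `at::native::round_impl` rounds ties to even, as `std::nearbyint` does under the default rounding mode, and that `vec_round` behaved like `std::round` (ties away from zero). Both points are assumptions drawn from the note, not checked against the commit's code, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Ties away from zero: the rule std::round follows (2.5 -> 3, -2.5 -> -3).
inline double round_half_away(double x) { return std::round(x); }

// Ties to even: std::nearbyint under the default FE_TONEAREST mode
// (2.5 -> 2, 3.5 -> 4).
inline double round_half_even(double x) { return std::nearbyint(x); }

// True when the two tie-breaking rules disagree on x.
// This only happens for values of the form (even integer) + 0.5.
inline bool rounding_rules_differ(double x) {
  return round_half_away(x) != round_half_even(x);
}

// Small self-check over the cases named in the commit note.
inline void check_tie_cases() {
  assert(rounding_rules_differ(0.5));   // 0 + 0.5: 1 vs 0
  assert(rounding_rules_differ(2.5));   // 2 + 0.5: 3 vs 2
  assert(!rounding_rules_differ(1.5));  // 1 + 0.5: both give 2
  assert(!rounding_rules_differ(2.4));  // not a tie: both give 2
}
```

Calling `check_tie_cases()` exercises exactly the "(even)+0.5" inputs. A vector test that compares against a ties-to-even reference would flag those inputs for any implementation that rounds ties away from zero.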
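
The `f(func_name)` entries in the math-function table mean that one function is built on top of another vectorized one. Here is a minimal scalar sketch of that idea, using textbook identities for `log2`, `log10` and `sinh`. The function names are hypothetical and the formulas are illustrative. The commit's actual VSX code works on vector registers and may use different formulations.

```cpp
#include <cassert>
#include <cmath>

// log2(x) = ln(x) / ln(2), written as a multiply by a precomputed constant.
inline float log2_via_log(float x) {
  assert(x > 0.0f);  // this sketch only covers the positive domain
  constexpr float kInvLn2 = 1.4426950408889634f;  // 1 / ln(2)
  return std::log(x) * kInvLn2;
}

// log10(x) = ln(x) / ln(10).
inline float log10_via_log(float x) {
  assert(x > 0.0f);  // this sketch only covers the positive domain
  constexpr float kInvLn10 = 0.4342944819032518f;  // 1 / ln(10)
  return std::log(x) * kInvLn10;
}

// sinh(x) = (e^x - e^-x) / 2, computed from a single exp call.
// Near x = 0 the subtraction cancels, so precision degrades there.
inline float sinh_via_exp(float x) {
  const float e = std::exp(x);
  return 0.5f * (e - 1.0f / e);
}
```

Composed forms like these are cheap to write once the base function is vectorized, but they inherit its rounding error and add their own. That fits the commit's remark about the precision of the float math functions.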
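
The testing notes describe random comparison against the std implementation inside a chosen domain. A rough sketch of such a check follows. `count_mismatches` is a hypothetical helper, not part of the Vec256 test framework, and the tolerance rule (relative to `max(1, |expected|)`) is an assumption chosen for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>

// Draws `trials` samples uniformly from [lo, hi) and counts how many times
// `candidate` deviates from `reference` by more than rel_tol, measured
// relative to max(1, |reference value|).
template <typename Candidate, typename Reference>
std::size_t count_mismatches(Candidate candidate, Reference reference,
                             double lo, double hi, double rel_tol,
                             std::size_t trials = 10000, unsigned seed = 42) {
  assert(lo < hi);  // the tested domain must be non-empty
  std::mt19937 gen(seed);  // fixed seed keeps failures reproducible
  std::uniform_real_distribution<double> dist(lo, hi);
  std::size_t bad = 0;
  for (std::size_t i = 0; i < trials; ++i) {
    const double x = dist(gen);
    const double want = reference(x);
    const double got = candidate(x);
    const double scale = std::max(1.0, std::fabs(want));
    // Written as a negated <= so that NaN results count as mismatches.
    if (!(std::fabs(got - want) <= rel_tol * scale)) ++bad;
  }
  return bad;
}
```

Restricting `lo` and `hi` mirrors the commit's approach of testing some functions only within the domain where the implementation is expected to hold. Narrowing the range removes known overflow or precision failures from the check, but it does not fix them.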

Author

quickwritereader

Committer

facebook-github-bot

Parents

a3e1bd1f
