pytorch
5c252f2c - [Inductor/cpp] Fix reduction on pre clang-10 (#103347)

Commit
1 year ago
[Inductor/cpp] Fix reduction on pre clang-10 (#103347)

`#pragma omp declare reduction` is not supported before clang-10 and results in a misleading compiler error in the following example:

```c++
template<typename T> T max_propagate_nan(T, T);

extern "C" void cpp_fused_argmax_max_sum_0(const float* in_ptr0,
                       float* out_ptr0,
                       float* out_ptr1,
                       long* out_ptr2)
{
    float tmp_acc0 = 0;
    float tmp_acc1 = -std::numeric_limits<float>::infinity();
    float tmp_acc2 = std::numeric_limits<float>::infinity();
    struct IndexValue_7 {size_t index; float value;};
    IndexValue_7 tmp_acc3{0, -std::numeric_limits<float>::infinity()};
    #pragma omp declare reduction(argmax : IndexValue_7 : omp_out.value = omp_in.value < omp_out.value ? omp_out.value : omp_in.value, omp_out.index = omp_in.value < omp_out.value ? omp_out.index : omp_in.index) initializer(omp_priv = {0, -std::numeric_limits<float>::infinity()})
    for(long i0=static_cast<long>(0L); i0<static_cast<long>(3L); i0+=static_cast<long>(1L))
    {
        auto tmp0 = in_ptr0[static_cast<long>(i0)];
        tmp_acc0 = tmp_acc0 + tmp0;
        tmp_acc1 = max_propagate_nan(tmp_acc1, tmp0);
        if (tmp_acc3.value < tmp0) {
            tmp_acc3.index = i0; tmp_acc3.value = tmp0;
        }
    }
    out_ptr0[static_cast<long>(0L)] = tmp_acc0;
    out_ptr1[static_cast<long>(0L)] = tmp_acc1;
    out_ptr2[static_cast<long>(0L)] = tmp_acc3.index;
}
```

```
% clang++-10 -std=c++17 -fopenmp bar.cpp -c -O3
% clang++-9 -std=c++17 -fopenmp bar.cpp -c -O3
bar.cpp:17:149: error: expected ')'
    #pragma omp declare reduction(argmax : IndexValue_7 : omp_out.value = omp_in.value < omp_out.value ? omp_out.value : omp_in.value, omp_out.index = omp_in.value < omp_out.value ? omp_out.index : omp_in.index) initializer(omp_priv = {0, -std::numeric_limits<float>::infinity()})
                                                                                                                                                    ^
bar.cpp:17:34: note: to match this '('
    #pragma omp declare reduction(argmax : IndexValue_7 : omp_out.value = omp_in.value < omp_out.value ? omp_out.value : omp_in.value, omp_out.index = omp_in.value < omp_out.value ? omp_out.index : omp_in.index) initializer(omp_priv = {0, -std::numeric_limits<float>::infinity()})
                                 ^
1 error generated.
```

Also, remove the unnecessary `struct` keyword in front of the type, as the C++ compiler already assumes it (and again, it causes a problem with the clang++-10 implementation).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/103347
Approved by: https://github.com/voznesenskym
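For illustration only, here is a minimal sketch of one way a custom argmax reduction can be hidden from older clang versions with a preprocessor guard. It is not taken from the PR and may differ from what the actual change does. The names `IndexValue`, `pick_larger`, `HAS_ARGMAX_REDUCTION`, and `argmax_index` are hypothetical, and the version check on `__clang_major__` is an assumption based on the clang-10 boundary described above. When the guard is off, or OpenMP is disabled, the loop simply runs serially.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical accumulator type, modeled on IndexValue_7 from the reproducer.
struct IndexValue {
    std::size_t index;
    float value;
};

// Hypothetical helper: keep the larger value; on ties keep the smaller index,
// so the result does not depend on how iterations are split across threads.
inline IndexValue pick_larger(const IndexValue& a, const IndexValue& b) {
    if (b.value > a.value) return b;
    if (b.value == a.value && b.index < a.index) return b;
    return a;
}

// Assumption: only declare the custom reduction on non-clang compilers or on
// clang >= 10, since the commit message reports a failure on clang++-9.
#if !defined(__clang_major__) || __clang_major__ >= 10
#pragma omp declare reduction(argmax : IndexValue : \
    omp_out = pick_larger(omp_out, omp_in)) \
    initializer(omp_priv = {0, -std::numeric_limits<float>::infinity()})
#define HAS_ARGMAX_REDUCTION 1
#else
#define HAS_ARGMAX_REDUCTION 0
#endif

// Returns the index of the first maximum element (0 for an empty input).
std::size_t argmax_index(const std::vector<float>& xs) {
    IndexValue acc{0, -std::numeric_limits<float>::infinity()};
    const long n = static_cast<long>(xs.size());
#if HAS_ARGMAX_REDUCTION
    #pragma omp parallel for reduction(argmax : acc)
#endif
    for (long i = 0; i < n; ++i) {
        const float v = xs[static_cast<std::size_t>(i)];
        if (acc.value < v) {
            acc.index = static_cast<std::size_t>(i);
            acc.value = v;
        }
    }
    return acc.index;
}
```

Calling `argmax_index({1.0f, 5.0f, 3.0f})` yields index `1` whether or not the file is compiled with `-fopenmp`. Note that `IndexValue acc{...}` is declared without a leading `struct` keyword, which C++ does not require once the type is defined.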