MaxUnpooling: parallel_for not always backed by OMP (#65655)
Summary:
Use `c10::optional` + `std::atomic_thread_fence` instead of `#pragma omp critical` inside the max_unpooling kernels.
Using any OpenMP pragma inside an `at::parallel_for` body is wrong, as `parallel_for` can be implemented on top of native threading primitives such as pthreads.
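For context, a sketch of the kind of pattern being removed (the loop body, `indices`, `numel`, and `upper_bound` are illustrative stand-ins, not the actual kernel code; `has_error` and `error_index` are the variables mentioned below):

```cpp
bool has_error = false;
int64_t error_index = -1;
// Problematic: the pragma assumes an OpenMP backend, but at::parallel_for
// may be backed by pthreads, where "critical" gives no mutual exclusion.
at::parallel_for(0, numel, 0, [&](int64_t start, int64_t end) {
  for (const auto i : c10::irange(start, end)) {
    if (indices[i] < 0 || indices[i] >= upper_bound) {
#pragma omp critical
      {
        has_error = true;
        error_index = i;
      }
    }
  }
});
```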
`c10::optional` is a much better fit than the pair of `has_error` and `error_index` variables: an unset optional already means "no error". Use `std::atomic_thread_fence` to ensure the `error_index` value is synchronized across threads.
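A minimal sketch of the resulting pattern (the function name `check_indices` and its parameters are hypothetical, used here only to illustrate the technique):

```cpp
#include <ATen/Parallel.h>
#include <c10/util/Exception.h>
#include <c10/util/Optional.h>
#include <c10/util/irange.h>
#include <atomic>

// Hypothetical kernel skeleton: validate indices in parallel and report an
// out-of-range one after the fact, without any OpenMP-specific pragma.
void check_indices(const int64_t* indices, int64_t numel, int64_t upper_bound) {
  c10::optional<int64_t> error_index;
  at::parallel_for(0, numel, 0, [&](int64_t start, int64_t end) {
    for (const auto i : c10::irange(start, end)) {
      if (indices[i] < 0 || indices[i] >= upper_bound) {
        // Record the offending position; the unset state doubles as `has_error`.
        error_index = i;
        // Publish the write so the check after parallel_for observes it.
        std::atomic_thread_fence(std::memory_order_release);
        return;
      }
    }
  });
  if (error_index) {
    std::atomic_thread_fence(std::memory_order_acquire);
    TORCH_CHECK(false, "Found an invalid max index: ", indices[*error_index]);
  }
}
```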
It also fixes an ICE (internal compiler error) reported in https://github.com/pytorch/pytorch/issues/65578.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65655
Reviewed By: ngimel
Differential Revision: D31206501
Pulled By: malfet
fbshipit-source-id: 93df34530e721777b69509cd6c68f5d713fb2b2a