pytorch
ab2a9ab9 - Non-blocking SyncBatchNorm update (#36659)

Commit

4 years ago

Non-blocking SyncBatchNorm update (#36659)

Summary:
As shown in https://github.com/pytorch/pytorch/issues/36452, SyncBatchNorm can block the host thread because of the ``MemcpyDtoH`` and ``MemcpyHtoD`` operations incurred when handling the ``counts`` argument of the native function ``batch_norm_gather_stats_with_counts``.

- This fix changes the signature of ``batch_norm_gather_stats_with_counts`` to
  ```c++
  std::tuple<Tensor, Tensor> batch_norm_gather_stats_with_counts_cuda(
      const Tensor& self, const Tensor& mean, const Tensor& invstd,
      const Tensor& running_mean, const Tensor& running_var,
      double momentum, double epsilon, const Tensor& counts)
  ```
  so that it can receive ``counts`` directly as a ``CUDATensor`` rather than as an ``IntArrayRef``, whose data is in host memory.
- This fix also improves the implementation of the ``SyncBatchNorm`` function so that constructing the ``counts`` tensor does not cause an additional ``MemcpyHtoD``, which would also block the host thread.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/36659

Differential Revision: D21196991

Pulled By: ngimel

fbshipit-source-id: 84a529e6cf22e03618fecbb8f070ec452f81229e
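To illustrate why ``counts`` is an input to ``batch_norm_gather_stats_with_counts`` at all, here is a minimal host-side sketch of a count-weighted merge of per-replica statistics for a single channel. It is plain C++ for illustration only, not the CUDA kernel from this commit. The names ``MergedStats`` and ``merge_stats_with_counts`` are hypothetical, and the sketch assumes each replica reports ``invstd = 1 / sqrt(var + eps)`` with ``var`` the biased variance.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical result type for this sketch (not part of PyTorch).
struct MergedStats {
    double mean;
    double invstd;
};

// Merge per-replica (mean, invstd, count) for one channel into global
// statistics. Assumption: invstd_i = 1 / sqrt(var_i + eps), where var_i
// is the biased variance computed on replica i.
MergedStats merge_stats_with_counts(const std::vector<double>& means,
                                    const std::vector<double>& invstds,
                                    const std::vector<double>& counts,
                                    double eps) {
    double total = 0.0;   // total number of elements across replicas
    double sum = 0.0;     // count-weighted sum of means
    double sum_sq = 0.0;  // count-weighted sum of E[x^2] per replica
    for (std::size_t i = 0; i < means.size(); ++i) {
        // Recover the biased variance from the inverse standard deviation.
        const double var = 1.0 / (invstds[i] * invstds[i]) - eps;
        total += counts[i];
        sum += counts[i] * means[i];
        sum_sq += counts[i] * (var + means[i] * means[i]);
    }
    const double mean = sum / total;
    const double var = sum_sq / total - mean * mean;  // E[x^2] - E[x]^2
    return {mean, 1.0 / std::sqrt(var + eps)};
}
```

Because every replica's contribution is weighted by its element count, the counts have to be available wherever the reduction runs. Per the summary above, passing them as a tensor already on the device is what avoids the host-blocking copies.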

Author

winggan

Committer

facebook-github-bot

Parents

f11df2d2
