slow_conv2d_forward: avoid calling dispatcher in parallel region (#65724)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65724
See gh-56794
Avoid dispatch inside of parallel_for by:
1. Replacing Tensor slicing with TensorAccessor
2. Copying bias into the output only once, outside of the parallel region
3. Replacing `addmm_` with a direct call to gemm (see the sketch after this list).
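To make the intent concrete, here is a minimal, hypothetical sketch of the resulting pattern, not the actual kernel from this PR: the `forward_pattern_sketch` name, the tensor shapes, and the float-only accessors are assumptions for illustration, and the real code is templated over scalar types and operates on the unfolded input columns.

```cpp
#include <ATen/ATen.h>
#include <ATen/Parallel.h>
#include <c10/util/irange.h>

// Sketch only: batched forward where bias handling and per-sample access
// avoid calling the dispatcher inside the parallel region.
void forward_pattern_sketch(
    at::Tensor& output,        // [N, C_out, H_out * W_out], assumed contiguous
    const at::Tensor& input,   // [N, C_in, K], assumed contiguous (stands in for the unfolded input)
    const at::Tensor& bias) {  // [C_out], may be undefined
  // (2) Copy the bias into the output once, before the parallel region;
  //     this single copy_ still dispatches, but only one time per call.
  if (bias.defined()) {
    output.copy_(bias.reshape({1, bias.size(0), 1}).expand_as(output));
  } else {
    output.zero_();
  }

  // (1) Raw accessors instead of Tensor slicing: indexing an accessor is
  //     plain pointer arithmetic, so nothing below goes through the dispatcher.
  auto output_a = output.accessor<float, 3>();
  auto input_a = input.accessor<float, 3>();

  at::parallel_for(0, output.size(0), 0, [&](int64_t begin, int64_t end) {
    for (const auto n : c10::irange(begin, end)) {
      float* out_ptr = output_a[n].data();      // per-sample output block
      const float* in_ptr = input_a[n].data();  // per-sample input block
      // (3) A direct gemm on out_ptr/in_ptr (e.g. through the cpublas wrappers)
      //     replaces the former addmm_ call; its arguments are elided here.
      (void)out_ptr;
      (void)in_ptr;
    }
  });
}
```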
Technically this also adds a new requirement that the output always be
contiguous, but the out-argument version isn't exposed or used anywhere
in the `torch.nn` API, so that should be fine.
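For reference, a hedged sketch of how that precondition could be asserted at the top of the kernel; the helper name is assumed, and the PR may enforce the layout differently (e.g. by resizing the output to a contiguous buffer):

```cpp
#include <ATen/ATen.h>
#include <c10/util/Exception.h>

// Assumed helper name; writing through raw accessors and gemm requires a
// dense, contiguous output buffer.
static void check_contiguous_output(const at::Tensor& output) {
  TORCH_CHECK(output.is_contiguous(),
              "slow_conv2d_forward: expected a contiguous output tensor");
}
```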
Test Plan: Imported from OSS
Reviewed By: saketh-are
Differential Revision: D31257875
Pulled By: ngimel
fbshipit-source-id: 84d2b39e7f65334bdfcc2c4719f93ee3c514ca32