slow_conv2d grad_input: avoid dispatch in parallel region (#65725)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65725
See gh-56794
Avoid dispatch inside of `parallel_for` by:
1. Replacing Tensor slicing with `TensorAccessor`
2. Calling `grad_input.zero_()` only once, outside of the parallel region
3. Replacing `at::mm` with a direct `gemm` call
Test Plan: Imported from OSS
Reviewed By: saketh-are
Differential Revision: D31257876
Pulled By: ngimel
fbshipit-source-id: f2902edeccd161431c1dfb1ab3e165d039ec259d