pytorch
ec159429 - remove unnecessary __syncthreads() in conv_depthwise2d_grad_weight_kernel (#84854)

Commit
2 years ago
Threads within a thread block are synchronized inside the function BlockReduceSum once the intra-warp reduction finishes, so it is unnecessary to synchronize them again before invoking BlockReduceSum.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/84854
Approved by: https://github.com/ngimel
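
To illustrate the reasoning, here is a minimal sketch of a block-wide sum that places its barrier right after the intra-warp step, as the message describes. It is not the PyTorch source: `warp_reduce_sum`, `block_reduce_sum`, `sum_kernel`, and the constants are hypothetical names chosen for this example, and it assumes a 1-D block whose size is a multiple of 32 and at most 1024. It also assumes each thread's partial value lives in a register, which is why no barrier is needed before the call.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

constexpr int kWarpSize = 32;    // assumed warp size (NVIDIA GPUs)
constexpr int kBlockSize = 256;  // assumed block size, a multiple of kWarpSize

// Intra-warp reduction using register shuffles; lane 0 ends up with the warp's sum.
__inline__ __device__ float warp_reduce_sum(float val) {
  for (int offset = kWarpSize / 2; offset > 0; offset >>= 1) {
    val += __shfl_down_sync(0xffffffffu, val, offset);
  }
  return val;
}

// Hypothetical block reduction; the result is valid in thread 0 only.
// `shared` must hold at least one float per warp in the block.
__inline__ __device__ float block_reduce_sum(float val, float* shared) {
  const int tid = threadIdx.x;
  const int lane = tid % kWarpSize;
  const int warp = tid / kWarpSize;

  val = warp_reduce_sum(val);
  // Barrier inside the helper: it protects `shared` from any earlier use,
  // so callers do not need their own __syncthreads() before the call.
  __syncthreads();
  if (lane == 0) {
    shared[warp] = val;  // one partial sum per warp
  }
  __syncthreads();  // make the per-warp partials visible to warp 0

  const int num_warps = blockDim.x / kWarpSize;
  val = (tid < num_warps) ? shared[lane] : 0.0f;
  if (warp == 0) {
    val = warp_reduce_sum(val);  // combine the per-warp partials
  }
  return val;
}

__global__ void sum_kernel(const float* in, float* out, int n) {
  __shared__ float shared[kWarpSize];

  // Each thread accumulates a private partial sum in a register.
  float partial = 0.0f;
  for (int i = threadIdx.x; i < n; i += blockDim.x) {
    partial += in[i];
  }

  // No __syncthreads() here: `partial` is thread-private, and the helper
  // synchronizes the block before it touches shared memory.
  const float total = block_reduce_sum(partial, shared);
  if (threadIdx.x == 0) {
    *out = total;
  }
}

int main() {
  const int n = 1000;
  float* in = nullptr;
  float* out = nullptr;
  cudaMallocManaged(&in, n * sizeof(float));
  cudaMallocManaged(&out, sizeof(float));
  for (int i = 0; i < n; ++i) {
    in[i] = 1.0f;
  }

  sum_kernel<<<1, kBlockSize>>>(in, out, n);
  cudaDeviceSynchronize();

  std::printf("sum = %.1f\n", *out);
  const bool ok = (*out == static_cast<float>(n));
  cudaFree(in);
  cudaFree(out);
  return ok ? 0 : 1;
}
```

A barrier before the call would only matter if the threads had written shared memory that another thread reads before or during the reduction; in this sketch nothing of that kind happens, so the extra barrier would add latency without changing the result.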