Allocate one warp per input index in compute_cuda_kernel (#43354)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43354
Instead of assigning one thread per input index to write out that index's repeats, we assign a whole warp to each index. This avoids the costly uncoalesced memory accesses and branch divergence that occur when each thread repeats its own index independently.
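As a rough illustration of the warp-per-index scheme (a hypothetical sketch, not the actual PyTorch kernel; the names `repeats`, `offsets`, and `repeat_warp_per_index` and the exclusive-prefix-sum layout are assumptions for this example):

```cuda
// Sketch: each warp expands one input index into the output.
// repeats[i] = how many times index i appears in the output;
// offsets[i] = exclusive prefix sum of repeats (start of i's run).
__global__ void repeat_warp_per_index(
    const int64_t* repeats,
    const int64_t* offsets,
    int64_t* out,
    int64_t n_indices) {
  const int kWarpSize = 32;
  int64_t warp_id =
      ((int64_t)blockIdx.x * blockDim.x + threadIdx.x) / kWarpSize;
  int lane = threadIdx.x % kWarpSize;
  if (warp_id >= n_indices) return;

  int64_t start = offsets[warp_id];
  int64_t rep = repeats[warp_id];
  // All 32 lanes of the warp share one loop bound, so there is no
  // intra-warp branch divergence; consecutive lanes write consecutive
  // output slots, so the stores coalesce.
  for (int64_t j = lane; j < rep; j += kWarpSize) {
    out[start + j] = warp_id;
  }
}
```

Under the old thread-per-index scheme, adjacent threads wrote to runs starting at different offsets and looped different numbers of times, producing scattered stores and divergent loops within a warp; here both problems disappear by construction.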
Test Plan: Run trainer to test
Reviewed By: ngimel
Differential Revision: D23230917
fbshipit-source-id: 731e912c844f1d859b0384fcaebafe69cb4ab56a