pytorch
1d7cf4b2 - Reduce overhead when Future invokes callbacks inline (#57638)

Commit

3 years ago

Reduce overhead when Future invokes callbacks inline (#57638)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57638

In RPC there are a few instances of "fastpaths" which do `if (fut.isCompleted()) { do_sth(); } else { fut.addCallback(do_sth); }`. I intend to get rid of them, for reasons I'll clarify later but which in a nutshell have to do with CUDA correctness and readability.

Note that dropping the fastpath introduces no change in behavior (because `addCallback` invokes the callback inline anyways), thus the only perf concern comes from the fact that the fastpath avoids constructing and passing around a `std::function`. I don't think this is a significant performance hit.

Regardless, this PR preemptively addresses this concern, by tweaking `addCallback` (and similar methods) so they can handle raw lambdas, and so that they do _not_ wrap them into `std::function`s if they are invoked inline.

ghstack-source-id: 129567067

Test Plan: CI

Reviewed By: mrshenli

Differential Revision: D28222808

fbshipit-source-id: eb1c7114cf7aca3403cb708f14287cab0907ecfa
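The idea behind the `addCallback` tweak can be illustrated with a minimal sketch. This is not the PyTorch implementation: `SketchFuture`, `markCompleted`, and `wrappedCount` are hypothetical names chosen for illustration, and callbacks take no arguments here for simplicity. The sketch assumes the approach is a templated `addCallback` that calls the raw callable directly when the future is already complete, and only converts it to a `std::function` when it has to be stored for later.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for a future type (not the PyTorch class).
class SketchFuture {
 public:
  bool isCompleted() const {
    std::lock_guard<std::mutex> lock(mutex_);
    return completed_;
  }

  // Marks the future complete and runs every deferred callback.
  void markCompleted() {
    std::vector<std::function<void()>> pending;
    {
      std::lock_guard<std::mutex> lock(mutex_);
      completed_ = true;
      pending.swap(callbacks_);
    }
    for (auto& cb : pending) {
      cb();
    }
  }

  // Accepts any callable. The std::function wrapper is only built on the
  // deferred path; the inline path invokes the callable as-is.
  template <typename F>
  void addCallback(F&& callback) {
    std::unique_lock<std::mutex> lock(mutex_);
    if (completed_) {
      lock.unlock();
      std::forward<F>(callback)();  // inline: no std::function constructed
      return;
    }
    callbacks_.emplace_back(std::forward<F>(callback));  // deferred: wrap now
    ++wrappedCount_;
  }

  // Illustration-only counter of how many callbacks were wrapped.
  int wrappedCount() const {
    std::lock_guard<std::mutex> lock(mutex_);
    return wrappedCount_;
  }

 private:
  mutable std::mutex mutex_;
  bool completed_ = false;
  std::vector<std::function<void()>> callbacks_;
  int wrappedCount_ = 0;
};
```

With a method shaped like this, a caller can write `fut.addCallback([&] { do_sth(); });` unconditionally: on an already-completed future the lambda runs immediately without being wrapped, so a hand-written `isCompleted()` fastpath would no longer save the `std::function` construction.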

Author

Committer

facebook-github-bot

Parents

ce2f1c29
