pytorch
07c5cb8c - [Static Runtime] Optimize memory planner initialization (#64101)

Commit

3 years ago

[Static Runtime] Optimize memory planner initialization (#64101)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64101

Checking `getOutOfPlaceOperation(n)` is a very expensive operation, especially in multithreaded environments, due to a lock acquisition when the NNC cache is queried. This slows down the memory planner initialization time, and by extension, the latency for the first static runtime inference.

There are two optimizations in this diff:
* Cache the result of `p_node->has_out_variant()` to avoid the call to `getOutOfPlaceOperation`. This speeds up calls to `canReuseInputOutputs`, which in turn speeds up `isOptimizableContainerType`.
* Precompute all `isOptimizableContainerType` during static runtime initialization to avoid a pass over all of each node's inputs.

Test Plan: All unit tests pass: `buck test caffe2/benchmarks/static_runtime/...`

Reviewed By: movefast1990

Differential Revision: D30595579

fbshipit-source-id: 70aaa7af9589c739c672788bf662f711731864f2
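The two optimizations described in the summary can be illustrated with a minimal, self-contained sketch. This is not the Static Runtime code: `OutVariantRegistry`, `Node`, `computeIsOptimizableContainer`, `precomputeOptimizableContainers`, and the op names are all hypothetical stand-ins, and the container check is a simplified assumption about what `isOptimizableContainerType` tests. The sketch only shows the pattern: pay for the lock-guarded lookup once per node at construction, then compute the per-container answer once during initialization so later queries become set lookups.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical stand-in for a lock-guarded lookup such as the one the
// commit message attributes to getOutOfPlaceOperation / the NNC cache.
class OutVariantRegistry {
 public:
  explicit OutVariantRegistry(std::unordered_set<std::string> ops)
      : ops_(std::move(ops)) {}

  // Every call takes the lock; this is the cost the cache avoids.
  bool hasOutVariant(const std::string& op) {
    std::lock_guard<std::mutex> guard(mutex_);
    ++lookups_;
    return ops_.count(op) > 0;
  }

  std::size_t lookups() const {
    std::lock_guard<std::mutex> guard(mutex_);
    return lookups_;
  }

 private:
  mutable std::mutex mutex_;
  std::unordered_set<std::string> ops_;
  std::size_t lookups_ = 0;
};

// Hypothetical node type. Optimization 1: the registry is queried once,
// in the constructor, and the answer is stored in has_out_variant_.
class Node {
 public:
  Node(std::string op, std::vector<const Node*> inputs,
       OutVariantRegistry& registry)
      : op_(std::move(op)),
        inputs_(std::move(inputs)),
        has_out_variant_(registry.hasOutVariant(op_)) {}

  const std::string& op() const { return op_; }
  const std::vector<const Node*>& inputs() const { return inputs_; }
  bool has_out_variant() const { return has_out_variant_; }  // no lock

 private:
  std::string op_;
  std::vector<const Node*> inputs_;
  bool has_out_variant_;
};

// Simplified assumption: a container node is optimizable when every
// producer of its inputs has an out variant.
bool computeIsOptimizableContainer(const Node& n) {
  if (n.op() != "list_construct" && n.op() != "tuple_construct") {
    return false;
  }
  for (const Node* input : n.inputs()) {
    assert(input != nullptr);
    if (!input->has_out_variant()) {
      return false;
    }
  }
  return true;
}

// Optimization 2: walk each node's inputs once at initialization and
// remember the result, so later checks are a set lookup.
std::unordered_set<const Node*> precomputeOptimizableContainers(
    const std::vector<const Node*>& nodes) {
  std::unordered_set<const Node*> result;
  for (const Node* n : nodes) {
    if (computeIsOptimizableContainer(*n)) {
      result.insert(n);
    }
  }
  return result;
}
```

In this sketch the number of lock acquisitions equals the number of nodes, regardless of how many times the planner later asks whether a node has an out variant or whether a container is optimizable.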

References

#65112 - [LTC] Merge master

Author

Mike Iovine

Committer

facebook-github-bot

Parents

2d75ab0c
