8b54b14f - [Static Runtime] Added a cache for NNC generated code across different calls to the same ops (#62921)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/62921

Added a cache for NNC-generated code that is shared across different calls to the same ops.

Before this diff:
```
ProcessedNode time 13402.9 ms
Static Module initialization took 30964.8 ms
```

After this diff:
```
ProcessedNode time 85.4195 ms
Static Module initialization took 4348.42 ms
```

There is one global cache for all ops, guarded by a reader-writer lock. The lock is necessary because multiple threads can load different models in parallel. Note that this locking does not guarantee that exactly one piece of code is generated per op: more than one thread may generate code for the same op simultaneously, and each will update the cache in some order. The amount of duplicated work is small, bounded by the number of threads, and there is no correctness issue, since the generated code is always identical; the code generated by the last thread is retained in the cache and reused later when running the model.

Test Plan: Tested the inline_cvr model

Reviewed By: hlu1

Differential Revision: D30104017

fbshipit-source-id: 32e9af43d7e724ed54b661dfe58a73a14e443ff7