Optimize alias analysis (#20899)

Commit

5 years ago

Optimize alias analysis (#20899)

Summary:
# Overall Improvements
1. Switched from `unordered_set` to a sparse bitset.
1. Prevented some excessive memory allocations (thanks to resistor).
1. Took advantage of the sparse bitset operations.
1. Switched to `flat_hash_map` instead of `unordered_map` in some places.

# Benchmarks (somewhat approximate, best of a couple of runs)
1. InceptionNet (load + one forward pass): 19.8 -> 13.3
1. GoogleNet (load + one forward pass): 10.0 -> 7.24
1. DenseNet (only load): 7.3 -> 5.3

I use the sparse bitset taken from https://llvm.org/doxygen/SparseBitVector_8h_source.html. I had to make some modifications so that it uses `__builtin_popcountl` and similar builtins instead of pulling in other transitive clang dependencies.

## Some notes on our graph topologies
In general, our graphs are very sparse, and most of the components aren't connected. For GoogleNet, we have 200k nodes, we do 2k `mayAlias` queries, and the sizes of the sets stored at the nodes sum to 500k (i.e., every node, on average, reaches 2.5 leaves).

PS: Holy crap, MacBooks throttle an insane amount with the default fan settings.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/20899

Differential Revision: D15564612

Pulled By: Chillee

fbshipit-source-id: 2a293a21a9be25f942ca888c8f225cab32bbfcd0
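To illustrate why a sparse bitset suits the sparse graphs described above, here is a minimal sketch. It is not the code from this commit and not LLVM's `SparseBitVector`: the class `SparseBitset`, the function `mayAliasSketch`, and the choice of a sorted `std::map` of 64-bit words are all assumptions made for illustration. It uses `__builtin_popcountll` (a GCC/Clang builtin) rather than `__builtin_popcountl` so that the word width is 64 bits on every platform.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical, simplified sparse bitset: only 64-bit words that have been
// written to are stored, keyed by word index (bit / 64).
class SparseBitset {
 public:
  void set(uint32_t bit) {
    words_[bit / 64] |= (uint64_t{1} << (bit % 64));
  }

  bool test(uint32_t bit) const {
    auto it = words_.find(bit / 64);
    return it != words_.end() && ((it->second >> (bit % 64)) & 1u) != 0;
  }

  // Number of set bits: one popcount per stored word.
  size_t count() const {
    size_t n = 0;
    for (const auto& kv : words_) {
      n += static_cast<size_t>(__builtin_popcountll(kv.second));
    }
    return n;
  }

  // In-place union, e.g. a node absorbing the leaves reachable from a child.
  SparseBitset& operator|=(const SparseBitset& other) {
    for (const auto& kv : other.words_) {
      words_[kv.first] |= kv.second;
    }
    return *this;
  }

  // True if the two sets share any bit. Both maps are sorted by word index,
  // so they can be walked in lockstep without hashing individual elements.
  bool intersects(const SparseBitset& other) const {
    auto a = words_.begin();
    auto b = other.words_.begin();
    while (a != words_.end() && b != other.words_.end()) {
      if (a->first < b->first) {
        ++a;
      } else if (b->first < a->first) {
        ++b;
      } else {
        if ((a->second & b->second) != 0) return true;
        ++a;
        ++b;
      }
    }
    return false;
  }

 private:
  std::map<uint32_t, uint64_t> words_;
};

// Hypothetical alias query: two values may alias if the sets of memory
// locations (leaves) they can reach overlap.
inline bool mayAliasSketch(const SparseBitset& a, const SparseBitset& b) {
  return a.intersects(b);
}

// Small usage example with three made-up reachability sets.
inline void usageExample() {
  SparseBitset x, y, z;
  x.set(3);
  x.set(200000);
  y.set(64);
  y.set(200000);
  z.set(5);
  assert(mayAliasSketch(x, y));   // both reach leaf 200000
  assert(!mayAliasSketch(x, z));  // no leaf in common
}
```

With sets that hold only a handful of leaves each, a query like this touches a few machine words instead of probing a hash set once per element, which is the kind of saving the bitset operations are meant to provide.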

Author

Chillee

Committer

facebook-github-bot

Parents

31aefd9b

pytorch 41635764 - Optimize alias analysis (#20899)
