Creates "Versioned Symbol" pattern to preserve serialized Torchscript semantics (#36300)
Summary:
PyTorch users write programs and save them as serialized Torchscript. When this Torchscript is loaded it contains symbols like "aten::div" describing some of the program's behavior. If the behavior of these symbols has changed since the program was serialized, however, then the original program's semantics may not be preserved.
For example, when we make aten::div always perform "true" division, like NumPy, Python3, and JAX, then serialized Torchscript programs relying on aten::div performing floor division on integral inputs will break.
This PR demonstrates the "Versioned Symbol" pattern that lets symbols be remapped into Torchscript builtins that preserve their historic behavior. Using this pattern, after we update aten::div to always perform true division, we could remap it in older Torchscript programs to a builtin that implements its historic behavior.
The pattern is described in the [Versioned Symbols] note in the code and is implemented like this:
- If BuiltinModule is given a version, before it returns a symbol it queries to see if another symbol should be substituted for it.
- versioned_symbol.cpp has a map for symbols and the version range for which another symbol should be substituted for them.
- The substitutions are implemented as builtin functions.
An example using the new, test-only _subcmul function is implemented to test this behavior. A test in jit/test_save_load.py follows the pattern described in the [Versioned Symbols] note and uses a fixture serialized with file version 2 to verify that the historic behavior is preserved.
In the future we will likely need a slightly more complex mechanism with multiple substitutions with distinct version ranges, and this just requires changing the map to be Symbol->Iterable.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36300
Differential Revision: D21058990
Pulled By: mruberry
fbshipit-source-id: 2b7c732878c0ecfcd9f0a6205fb6d6421feeaf61