Fix dynamo issue (#6527)

Commit

1 year ago

Fix dynamo issue (#6527) Dynamo use faketensor to trace tensor ops. In some case, the mechanism break compiling with deepspeed. An example could be found at https://gist.github.com/oraluben/9b8240c2fe482eb4382453d6c97a5f76, to see issues, install deepspeed==0.14.4 instead of my fork without this PR, llama cannot be compiled. Detailed explanation: 1. `ZeROOrderedDict` dynamo use deepcopy to copy tensors, which will call `object.__reduce__`. When copying `ZeROOrderedDict`, the default implementation do not copy its `_parent_module` and will lead to failure. 2. `param` maybe faketensor and do not have `ds_status` yet, but during tracing it's ok to just skip the `register_external_parameter`, it should be done ways before. --------- Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com> Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com> Co-authored-by: Masahiro Tanaka <81312776+tohtana@users.noreply.github.com>

References

#6527 - Fix dynamo issue

Author

oraluben

Parents

6e6563d3

DeepSpeed 3d5cf739 - Fix dynamo issue (#6527)

DeepSpeed
3d5cf739 - Fix dynamo issue (#6527)