6697f1e4 - Update shallow_copy_and_detach for nested tensor impls to enable nested tensor softmax backward (#81838)

# Summary

This change fixes a bug encountered when trying to add more backward formulas for nested tensor ops. If a derivative is defined that stores the "result" for use in the backward pass, the output of the forward op is saved via:

```
if (grad_fn) {
  grad_fn->result_ = SavedVariable(result, true);
}
```

`SavedVariable` calls a chain of functions that in turn calls `shallow_copy_and_detach`, and when https://github.com/pytorch/pytorch/blob/c179597753f20089056115542d3b9c74ab8b864c/c10/core/TensorImpl.cpp#L533 is hit, it calls `sizes_custom()`, which is not implemented for nested tensors and errors. I also noticed that, since the storage format for nested tensors is different (not a `storage_`, but two tensors), we should actually be calling the `NestedTensorImpl` constructor. This PR overrides `shallow_copy_and_detach` in the derived class and ensures that shallow copy works correctly.

## Update

- Added the softmax derivative in this PR, because it is a direct use case that was blocked by `shallow_copy_and_detach` not working correctly.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/81838
Approved by: https://github.com/soulitzer
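For context, here is a minimal sketch of what such an override could look like. This is not the verbatim PR diff: the member names `buffer_` and `nested_size_tensor_` and the two-argument constructor are assumptions about `NestedTensorImpl`'s layout; `shallow_copy_and_detach` and `copy_tensor_metadata` are the existing `c10::TensorImpl` hooks.

```
// Sketch: rebuild a NestedTensorImpl from its two constituent tensors
// instead of taking the dense TensorImpl path, which would touch storage_
// and the unimplemented sizes_custom().
c10::intrusive_ptr<c10::TensorImpl> NestedTensorImpl::shallow_copy_and_detach(
    const c10::VariableVersion& version_counter,
    bool allow_tensor_metadata_change) const {
  auto impl = c10::make_intrusive<NestedTensorImpl>(
      buffer_,              // assumed member: flat buffer holding the constituents
      nested_size_tensor_); // assumed member: per-constituent size metadata
  copy_tensor_metadata(
      /*src_impl=*/this,
      /*dest_impl=*/impl.get(),
      /*version_counter=*/version_counter,
      /*allow_tensor_metadata_change=*/allow_tensor_metadata_change);
  return impl;
}
```

The key design point is that the detached copy goes through the nested tensor constructor rather than through `storage_`, while `copy_tensor_metadata` carries over the version counter that `SavedVariable` relies on. Saving `result` matters for softmax specifically because its backward is computed from the forward output: for `y = softmax(x)`, `dx = y * (dy - (dy * y).sum(dim, keepdim=true))`.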