pytorch
36d91b55 - Add differentiable mkldnn_rnn_layer_backward to support double backward of LSTM (#100627)

Commit

1 year ago

Add differentiable mkldnn_rnn_layer_backward to support double backward of LSTM (#100627)

### Description

This PR fixes #99413, which exposes a limitation of double backward for LSTM when oneDNN is used.

This PR does not implement the double backward function itself, because that is hard to spell out by hand. Instead, it implements `mkldnn_rnn_layer_backward` using differentiable operations, so that double backward can be derived automatically.

The backward pass needs the gates and the hidden states passed between cells within a layer. However, these intermediate values are stored in the `workspace`, and it is hard to extract them from it. Therefore, the backward pass recomputes them first.

A corresponding UT has been added, based on the failing case in #99413. The UT with gradcheck and gradgradcheck added in https://github.com/pytorch/pytorch/pull/26660 cannot test LSTM with oneDNN, because that UT only supports the `double` datatype, which oneDNN does not support.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/100627
Approved by: https://github.com/jgong5, https://github.com/soulitzer
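
The idea in the description (recompute the intermediates in backward, and express the gradient with differentiable operations so that autograd can differentiate it again) can be illustrated with a toy sketch. This is not the code from this PR: `RecomputedGate` is a hypothetical single sigmoid gate standing in for the fused LSTM layer, and the example runs in `double` on the default CPU path rather than through oneDNN.

```python
import torch
from torch.autograd import Function, gradcheck, gradgradcheck


class RecomputedGate(Function):
    """Hypothetical toy stand-in for a fused kernel: y = sigmoid(x @ w)."""

    @staticmethod
    def forward(ctx, x, w):
        # Save only the inputs. We pretend the gate activation lives in an
        # opaque workspace and cannot be read back in backward.
        ctx.save_for_backward(x, w)
        return torch.sigmoid(x @ w)

    @staticmethod
    def backward(ctx, grad_out):
        x, w = ctx.saved_tensors
        # Recompute the gate first, using ordinary differentiable ops.
        gate = torch.sigmoid(x @ w)
        # Gradient of sigmoid, again built from differentiable ops, so
        # autograd can differentiate this backward a second time.
        grad_pre = grad_out * gate * (1 - gate)
        grad_x = grad_pre @ w.t()
        grad_w = x.t() @ grad_pre
        return grad_x, grad_w


torch.manual_seed(0)
x = torch.randn(3, 4, dtype=torch.double, requires_grad=True)
w = torch.randn(4, 5, dtype=torch.double, requires_grad=True)

# First-order and second-order numerical checks on the toy function.
first_order_ok = gradcheck(RecomputedGate.apply, (x, w))
second_order_ok = gradgradcheck(RecomputedGate.apply, (x, w))
```

Because `backward` contains no hand-written second derivative, the second-order check only passes if autograd can trace through the recomputation, which is the property the PR relies on for `mkldnn_rnn_layer_backward`.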

Author

yanbing-j

Committer

pytorchmergebot

Parents

d261e43c
