Use grad_out for cudnn CTC loss (#27039)
Summary:
Using grad_out for CuDNN CTC loss fixes: https://github.com/pytorch/pytorch/issues/26797, https://github.com/pytorch/pytorch/issues/25833.
We also work around an incompatible CuDNN change that surfaced during testing: as of CuDNN 7.6, the semantics of the CTC loss gradients are different.
This leads us to disable CuDNN CTC for CuDNN < 7.6. To mitigate the impact on users, we convert the parameters for the native implementation if CuDNN isn't applicable (previously this would give an error).
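For illustration only, here is a toy pure-Python sketch of why grad_out matters. It assumes a fused forward pass that returns the raw gradient alongside the loss, so the backward step has to scale that cached gradient by the upstream gradient. The names forward and backward and the sum-of-squares loss are hypothetical stand-ins, not the code touched by this change.

```python
def forward(batch):
    """Toy per-sample loss (sum of squares). Like a fused kernel, it also
    returns the raw gradient d(loss_i)/d(x_i) computed in the same pass."""
    losses = [sum(v * v for v in sample) for sample in batch]
    raw_grads = [[2.0 * v for v in sample] for sample in batch]
    return losses, raw_grads


def backward(grad_out, raw_grads):
    """Chain rule: scale each sample's cached gradient by the upstream
    gradient of that sample's loss instead of returning it unchanged."""
    return [
        [g_out * g for g in sample]
        for g_out, sample in zip(grad_out, raw_grads)
    ]


batch = [[1.0, 2.0], [3.0, 4.0]]
losses, raw_grads = forward(batch)

# A mean reduction over two samples sends 0.5 upstream to each loss entry.
grad_out = [0.5, 0.5]
grads = backward(grad_out, raw_grads)
```

Returning raw_grads directly is only correct when every entry of grad_out is 1; with a mean reduction or a scaled loss, the result is off by exactly the factor in grad_out.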
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27039
Differential Revision: D17910815
Pulled By: ngimel
fbshipit-source-id: 465b33612d3402f10c355aa7026a7e1ffaef3073