Compute input gradient only if required (CUDA) (#66070)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/66070
Test Plan: Imported from OSS
Reviewed By: dagitses
Differential Revision: D31431805
Pulled By: albanD
fbshipit-source-id: 8c3de6632aaee168ec6fd7eb79a5af26973af9c5