Fix gradient checkpointing bug in ImageGPT (#21816)
* Fix gradient checkpointing bug in GPT-NeoX
* Fix gradient checkpointing bug in modeling_imagegpt.py
* Revert GPT-NeoX changes
---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>