[GPT-J] Use the `float16` checkpoints in integration tests (#13676)
* Use fp16 checkpoints
* Style
* Fix outputs and disable OOM tests
* Correct another output
* Use a smaller, randomly initialized model for generation tests
* Repo quickfix
* Fix gradient checkpointing