Add torch_empty_cache_steps to TrainingArguments (#31546)
* Add torch_empty_cache_steps to TrainingArguments
* Fix formatting
* Add torch_empty_cache_steps to docs on single gpu training
* Remove check for torch_empty_cache_steps <= max_steps
* Captalize Tip
* Be device agnostic
* Fix linting