[Trainer] use output.loss when using liger-kernel (#42444)
* Use output.loss when using Liger-kernel
Handle loss computation for models using Liger-kernel.
Fixes #42414
* Clarify Liger-kernel loss computation in comments
* Both standard transformers models and Liger models handle shift_labels correctly via **kwargs
* Remove unused shift_labels reference in loss computation
* Remove unused model unwrapping
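
The idea behind the change can be sketched roughly as follows. This is an illustrative stand-in, not the actual Trainer code: `Output`, `select_loss`, and `loss_from_logits` are hypothetical names, and the sketch assumes the Liger-kernel path may leave `logits` unset, so the loss is read from the model output instead of being recomputed.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class Output:
    # Hypothetical stand-in for a model output object.
    loss: Optional[float] = None
    logits: Optional[Sequence[float]] = None


def select_loss(
    outputs: Output,
    labels: Sequence[int],
    use_liger_kernel: bool,
    loss_from_logits: Callable[[Sequence[float], Sequence[int]], float],
) -> float:
    """Pick the training loss for one step (illustrative only)."""
    if use_liger_kernel:
        # Assumption: the fused kernel already computed the loss and
        # logits may be missing, so use the model's own loss.
        if outputs.loss is None:
            raise ValueError("Expected the model to return a loss.")
        return outputs.loss
    # Otherwise compute the loss from logits with the supplied function.
    return loss_from_logits(outputs.logits, labels)
```

With `use_liger_kernel=True` the supplied `loss_from_logits` callable is never invoked, which is the behavior the first bullet describes.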