Compute Loss inside the training step. (#686)
* improved solution
* compute loss fix
* resolved comments
* removed duplicated code; used main trainer compute loss
* added --loss_in_train flag
* resolved comments
* resolved comments
* formatter using latest black
* add import for code quality
* formatter using latest black
* re-adding super loss compute
* resolved comments
* fix typo
* solve not exporting onnx models
* dictionary casting, bind method
* trainer fix with ruff
---------
Co-authored-by: Adam Louly <adamlouly@microsoft.com@orttrainingdev9.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>