Support multi-output models (#170)
* Push to remote
* Correctly handle multi-output models by doing loss scaling in backward()
Add unit tests for multi-output models
* Fix formatting issues
* Formatting issues fix
* Fix formatting
* Update DeepSpeedExamples submodule
Enable Megatron model tests
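
Note on the loss-scaling change: a minimal sketch of why scaling in backward() suits multi-output models. It assumes the scaling in question is division by the number of gradient accumulation steps, which this message does not state. `ToyEngine`, `two_headed_model`, and `grad_accum_steps` are hypothetical names for illustration and are not DeepSpeed's API. A running float stands in for accumulated gradients, which is reasonable because gradients are linear in the loss.

```python
class ToyEngine:
    """Hypothetical stand-in for a training engine (not DeepSpeed's API)."""

    def __init__(self, model, grad_accum_steps):
        self.model = model
        self.grad_accum_steps = grad_accum_steps  # assumed scaling factor
        self.accumulated = 0.0  # stands in for accumulated gradients

    def forward(self, x):
        # Return the model's outputs untouched, whatever their structure.
        # Scaling here would require guessing which output is the loss.
        return self.model(x)

    def backward(self, loss):
        # Scale here instead: the caller has already reduced the
        # outputs to the single scalar it wants to differentiate.
        scaled = loss / self.grad_accum_steps
        self.accumulated += scaled
        return scaled


def two_headed_model(x):
    # Two outputs, for example one loss term per head.
    return x * 2.0, x + 1.0


engine = ToyEngine(two_headed_model, grad_accum_steps=4)
for x in [1.0, 2.0, 3.0, 4.0]:
    loss_a, loss_b = engine.forward(x)   # tuple passes through unchanged
    engine.backward(loss_a + loss_b)     # scaled once, on the combined loss
```

With scaling in backward(), forward() never needs to know the shape of the model's return value, so tuples, dicts, or a single tensor all pass through unchanged.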