Keep loss_scale and Whole Loss Subgraph in FP32 during Mixed Precision Training (#4268)
* Keep the loss subgraph in FP32 during mixed-precision training.
* Fix the case where there is no whitelisted loss op.
* Get the loss subgraph nodes from loss_scale instead of from the whitelist.
* Rename const variables.
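
Illustrative sketch only, not the code in this change: one way to locate the loss subgraph starting from loss_scale rather than from an op whitelist. It assumes a toy graph stored as a dict of node name to input names, and a hypothetical `boundary` set naming the nodes where the model proper ends. The function name `collect_loss_subgraph` and the node names are made up for the example.

```python
from collections import deque


def collect_loss_subgraph(graph, loss_scale="loss_scale", boundary=frozenset()):
    """Return the names of nodes that would stay in FP32.

    graph:      hypothetical toy graph, node name -> list of input names
    loss_scale: name of the graph input that scales the loss
    boundary:   nodes that belong to the model itself; traversal stops there
    """
    # Start from every node that consumes loss_scale directly.
    queue = deque(name for name, inputs in graph.items() if loss_scale in inputs)
    keep = set()
    while queue:
        node = queue.popleft()
        if node in keep or node in boundary:
            continue
        keep.add(node)
        # Walk upstream through producers; graph inputs are not keys, so skip them.
        for src in graph.get(node, []):
            if src in graph:
                queue.append(src)
    return keep


toy_graph = {
    "matmul": ["x", "w"],
    "softmax_ce": ["matmul", "labels"],
    "reduce_mean": ["softmax_ce"],
    "scaled_loss": ["reduce_mean", "loss_scale"],
}
fp32_nodes = collect_loss_subgraph(toy_graph, boundary={"matmul"})
```

If no node consumes loss_scale, the result is an empty set, so nothing is forced to FP32.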
Co-authored-by: Vincent Wang <weicwang@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>