API for obtaining global gradient norm (#1292)
* FP16 fused and unfused grad norm query
* API for obtaining global unclipped gradient norm across parameter groups
* Use the global norm, not per-group norms
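
For context, a minimal sketch of how a global L2 gradient norm relates to per-group norms, assuming each parameter group reports its own unclipped L2 norm. The helper names (`l2_norm`, `global_grad_norm`) and the sample values are hypothetical and are not taken from this change:

```python
import math


def l2_norm(values):
    """L2 norm of a flat list of gradient values."""
    return math.sqrt(sum(v * v for v in values))


def global_grad_norm(group_norms):
    """Combine per-group L2 norms into a single global L2 norm.

    The global norm is the square root of the sum of squared group norms,
    which equals the L2 norm over all gradients concatenated together.
    """
    return math.sqrt(sum(n * n for n in group_norms))


# Two hypothetical parameter groups with made-up gradient values.
groups = [[3.0, 0.0], [0.0, 4.0]]
group_norms = [l2_norm(g) for g in groups]
total = global_grad_norm(group_norms)
print(total)
```

Combining the squared group norms gives the same value as taking the norm over all gradients at once, so a single global value can be reported regardless of how parameters are grouped.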
Co-authored-by: Shaden Smith <shaden.smith@microsoft.com>