Optimize clip_grad_norm_ function (#4915)
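Remove the .item() calls from clip_grad_norm_. Each .item() copies a
scalar to the host and forces it to wait for the device to finish, so
keeping the norm as a tensor reduces the time the host spends blocked.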
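
A minimal sketch of the idea, not the code from this change: the helper
name clip_grad_norm_sketch, its parameters, and the eps default are
hypothetical, chosen only for illustration. It keeps the total norm and
the clip coefficient as tensors, and uses torch.clamp in place of a
Python-level comparison, so nothing in the function calls .item().

```python
import torch


def clip_grad_norm_sketch(parameters, max_norm, norm_type=2.0, eps=1e-6):
    """Hypothetical helper: clip gradients in place without calling .item()."""
    grads = [p.grad for p in parameters if p.grad is not None]
    if not grads:
        return torch.tensor(0.0)

    # Per-tensor norms are 0-dim tensors that stay on the device.
    norms = torch.stack(
        [torch.linalg.vector_norm(g.detach(), norm_type) for g in grads]
    )
    total_norm = torch.linalg.vector_norm(norms, norm_type)

    # A Python `if total_norm > max_norm:` would read the value on the host.
    # Clamping the coefficient to at most 1.0 makes the scaling a no-op
    # whenever the norm is already within bounds.
    clip_coef = torch.clamp(max_norm / (total_norm + eps), max=1.0)
    for g in grads:
        g.detach().mul_(clip_coef.to(g.device))

    # Returned as a tensor; the caller decides if and when to read it.
    return total_norm
```

The caller can still log the returned norm, but converting it to a Python
float at that point is what triggers the wait, so it is best done rarely.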
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>