[ROCm] Enable BFloat16 type for loss functions and a few misc ops required for resnet50 (#34469)
Summary:
This PR enables the bfloat16 type on ROCm for loss criterion ops (and the ops they depend on) and for a few miscellaneous ops required to train resnet50.
iotamudelta ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34469
Differential Revision: D20348856
Pulled By: ezyang
fbshipit-source-id: 0a8f06c2169cfa3c9cf319120e27150170095f6c