Optimize some reduction operators on CPU BFloat16 (#55202)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/55202
Test Plan: Imported from OSS
Reviewed By: anjali411
Differential Revision: D28836790
Pulled By: VitalyFedyunin
fbshipit-source-id: f3a29633d85eb5a614652e568140e9b19509f959