Migrate uses of THCReduceApplyUtils to cuda_utils::BlockReduce (#64442)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64442
Test Plan: Imported from OSS
Reviewed By: mrshenli
Differential Revision: D30735341
Pulled By: ngimel
fbshipit-source-id: 3cb58bed8f1f5aa32fd49fd37b10c8490bcc645a