Add bitwise distributed reduction ops (#26824)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26824
These ops are named after the bitwise reduction ops in MPI (MPI_BAND, MPI_BOR, MPI_BXOR).
This is based on the work done by knottb in #22449.
Closes #22449.
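
For reviewers unfamiliar with the MPI semantics, here is a minimal single-process sketch of what a bitwise AND/OR/XOR reduction computes. It does not call torch.distributed, and the names rank_values and bitwise_reduce are hypothetical, chosen only for illustration:

```python
from functools import reduce
import operator

# Hypothetical integer contributions, one per rank (not taken from this PR).
rank_values = [0b1100, 0b0110, 0b0111]


def bitwise_reduce(op, values):
    """Fold the per-rank values with a binary bitwise operator."""
    return reduce(op, values)


band = bitwise_reduce(operator.and_, rank_values)  # bits set on every rank
bor = bitwise_reduce(operator.or_, rank_values)    # bits set on at least one rank
bxor = bitwise_reduce(operator.xor, rank_values)   # bits set on an odd number of ranks

print(f"{band:04b} {bor:04b} {bxor:04b}")  # prints "0100 1111 1101"
```

All three operators are associative and commutative, so the result should not depend on the order in which ranks are combined.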
Test Plan: Imported from OSS
Differential Revision: D17600210
Pulled By: pietern
fbshipit-source-id: 44c7041ce01bc5de170a4591c5a696e4f24431ef