[Distributed] Make xm.all_gather a single graph in Dynamo (#4922)
Summary:
This pull request makes the _all_gather_using_all_reduce path of xm.all_gather trace as a single graph in Dynamo. To do that, it:
1. removes a hardware type check; specializing for CPU doesn't seem to be worth it.
2. caches ordinal and xrt_world_size.
Test Plan:
PJRT_DEVICE=TPU python test/test_mp_all_gather.py