[Dist Debugality] Log key DDP metrics to stderr under debug mode. (#52957)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52957
This diff:
1. Under TORCH_DISTRIBUTED_DEBUG=INFO or DETAIL, logs DDP information during init time (all stats in ddp_logging_data_)
2. Under TORCH_DISTRIBUTED_DEBUG=DETAIL, logs runtime stats when they are collected (first 10 iterations and then once every 100 iterations). Avoiding logging every iteration to not spam logs.
Verified by inspecting logs:
```
I0226 19:12:47.109243 2818475 logger.cpp:140] [Rank 1]: DDP Initialized with:
world_size: 2 module_name: Linear device_ids: 1 output_device: 1 backend_name: nccl parameter_dtype: float total
_parameter_size_in_bytes: 40 num_parameter_tensors: 2 bucket_sizes: 40 CUDA_VISIBLE_DEVICES: N/Abroadcast_buffer
s: 1 bucket_cap_mb: 25 find_unused_parameters: 0 gradient_as_bucket_view: 0
Backend Info: nccl_socket_ifname: N/A nccl_blocking_wait: N/A nccl_debug: WARN nccl_nthreads: N/A nccl_ib_timeo
ut: N/A
I0226 19:12:47.109252 2818473 logger.cpp:140] [Rank 0]: DDP Initialized with:
world_size: 2 module_name: Linear device_ids: 0 output_device: 0 backend_name: nccl parameter_dtype: float total
_parameter_size_in_bytes: 40 num_parameter_tensors: 2 bucket_sizes: 40 CUDA_VISIBLE_DEVICES: N/Abroadcast_buffer
s: 1 bucket_cap_mb: 25 find_unused_parameters: 0 gradient_as_bucket_view: 0
Backend Info: nccl_socket_ifname: N/A nccl_blocking_wait: N/A nccl_debug: WARN nccl_nthreads: N/A nccl_ib_timeo
ut: N/A
```
```
I0226 19:12:48.117936 2818473 logger.cpp:286] [Rank 0 / 2] Training Linear unused_parameter_size=0
Avg forward compute time: 568944
Avg backward compute time: 885504
Avg backward comm. time: 692496
Avg backward comm/comp overlap time: 113536
I0226 19:12:48.118517 2818475 logger.cpp:286] [Rank 1 / 2] Training Linear unused_parameter_size=0
Avg forward compute time: 565584
Avg backward compute time: 876992
Avg backward comm. time: 201872
Avg backward comm/comp overlap time: 128624
```
ghstack-source-id: 123171875
Test Plan: CI
Reviewed By: zhaojuanmao
Differential Revision: D26708184
fbshipit-source-id: 16defd5610d28bc4cf3fc2a0cc564e84efcfa791