Add --local-ranks-filter to torchrun: allow logs filtering by rank (#118562)
Addresses issue https://github.com/pytorch/pytorch/issues/117383
The implementation exposes `--local-ranks-filter`, which restricts, by local rank, the set of log files passed to `TailLog` (the component torchrun uses to decide which logs are streamed to stdout/stderr).
## Behavior
### with --tee
Currently --tee is implemented as --redirect to a file, plus streaming that file to the console using `tail`. When --tee is specified, the log files are unaffected; only the output streamed to the console is filtered.
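The tee behavior can be sketched roughly as follows. This is a simplified illustration, not the `TailLog` implementation: `stream_filtered_logs` and its parameters are hypothetical names, the files are read once rather than followed like `tail`, and the `[rank0]:` prefix format is assumed from the sample output in the Usage section.

```python
from typing import Dict, Set, TextIO


def stream_filtered_logs(
    log_files: Dict[int, str],
    ranks_to_show: Set[int],
    dst: TextIO,
    role: str = "rank",
) -> None:
    """Copy per-rank log files to `dst`, skipping ranks outside the filter.

    The log files are only read, never modified, so every rank's file
    stays complete regardless of the filter.
    """
    for rank in sorted(log_files):
        if rank not in ranks_to_show:
            continue  # filtered out: the file exists but is not streamed
        with open(log_files[rank]) as f:
            for line in f:
                # Prefix each line with the role name and local rank (assumed format).
                dst.write(f"[{role}{rank}]:{line}")
```

Calling it with `ranks_to_show={0}` and `dst=sys.stdout` would print only rank 0's lines, while the log file of every other rank remains on disk untouched.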
### with --redirect
When --redirect is specified without --tee, nothing is logged to the console, so the filter is a no-op.
### with neither
When neither --tee nor --redirect is specified, torchrun uses the empty string "" to indicate logging to the console. For ranks excluded by the filter, we intercept this empty string and replace it with "/dev/null" so that nothing is printed to the console.
The API also allows a per-rank configuration of --tee and --redirect; this filter implementation supports that as well.
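A minimal sketch of the empty-string interception, under stated assumptions: `filter_console_redirects` is a hypothetical helper, not a function from torchrun, and the per-rank redirect configuration is modeled here as a plain dict from local rank to destination path.

```python
import os
from typing import Dict, Optional, Set


def filter_console_redirects(
    redirects: Dict[int, str],
    ranks_filter: Optional[Set[int]],
) -> Dict[int, str]:
    """Return a copy of `redirects` with filtered-out console ranks silenced.

    `redirects` maps local rank -> destination; "" means "log to console".
    Ranks already redirected to a file are left unchanged.
    """
    if not ranks_filter:
        # No filter given: keep every rank's destination as-is.
        return dict(redirects)
    return {
        rank: os.devnull if path == "" and rank not in ranks_filter else path
        for rank, path in redirects.items()
    }
```

For example, with ranks 0 and 1 logging to the console, rank 2 redirected to a file, and a filter of `{0}`, only rank 1's destination changes (to `os.devnull`); rank 2 keeps its file, which matches the no-op behavior described for --redirect.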
## Usage
### without --tee
```
> TORCH_LOGS_FORMAT="%(levelname)s: %(message)s" TORCH_LOGS="graph" torchrun --standalone --nproc_per_node=2 --role rank --local_ranks_filter=0 t.py
hello from rank 0 python
DEBUG: TRACED GRAPH
__compiled_fn_0 <eval_with_key>.0 opcode         name    target                   args       kwargs
-------------  ------  -----------------------  ---------  --------
placeholder    l_x_    L_x_                     ()         {}
call_function  mul     <built-in function mul>  (l_x_, 5)  {}
output         output  output                   ((mul,),)  {}
...
```
### with --tee
```
> TORCH_LOGS_FORMAT="%(levelname)s: %(message)s" TORCH_LOGS="graph" torchrun --standalone --nproc_per_node=2 --role rank --tee 3 --local_ranks_filter=0 t.py
[rank0]:hello from rank 0 python
[rank0]:DEBUG: TRACED GRAPH
[rank0]: __compiled_fn_0 <eval_with_key>.0 opcode         name    target                   args       kwargs
[rank0]:-------------  ------  -----------------------  ---------  --------
[rank0]:placeholder    l_x_    L_x_                     ()         {}
[rank0]:call_function  mul     <built-in function mul>  (l_x_, 5)  {}
[rank0]:output         output  output                   ((mul,),)  {}
...
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/118562
Approved by: https://github.com/wconstab, https://github.com/wanchaol