[tsm] add support for jetter to Role (base_image) for mast launches
Summary:
1. Adds `ml_image` buck macro
2. Adds `--run_path` option to `torch.distributed.run`
3. Adds `tsm/driver/fb/test/patched/foo` (for unit testing)
4. Changes `distributed_sum` to use `ml_image` (see Test Plan for how this was tested locally and on mast)
NOTE: jetter still needs to be enabled for the flow and local schedulers; that will be done in a separate diff since this one is already very large.
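For context on item 2, here is a rough sketch of what running a script "by path" in the launcher's own interpreter can look like. It assumes `--run_path` behaves like Python's standard `runpy.run_path`, which this diff does not spell out. `run_in_process` and the temporary `main.py` are hypothetical names used only for illustration and are not part of `torch.distributed.run`.

```python
import os
import runpy
import tempfile


def run_in_process(script_path: str) -> dict:
    """Run a script as __main__ inside the current interpreter.

    Hypothetical helper: illustrates in-process execution via runpy,
    as opposed to spawning a separate `python script.py` subprocess.
    """
    # Assumption: require an absolute path so the result does not
    # depend on the caller's working directory.
    if not os.path.isabs(script_path):
        raise ValueError(f"expected an absolute path, got: {script_path}")
    # runpy.run_path returns the executed script's global namespace.
    return runpy.run_path(script_path, run_name="__main__")


# Small demonstration with a throwaway script.
with tempfile.TemporaryDirectory() as tmp:
    script = os.path.join(tmp, "main.py")
    with open(script, "w") as f:
        f.write("import os\nRESULT = sum(range(4))\nPID = os.getpid()\n")
    namespace = run_in_process(script)
    # The script ran in this very process, so the PIDs match.
    same_process = namespace["PID"] == os.getpid()
```

Running in-process avoids a dependency on a separate `python` executable being on the path, which may matter when the entrypoint comes from a base image; whether that is the motivation here is an assumption.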
Test Plan:
## Local Testing
```
# build the two fbpkgs (base and main)
buck run //pytorch/elastic/examples/distributed_sum/fb:torchx.examples.dist_sum.base
buck run //pytorch/elastic/examples/distributed_sum/fb:torchx.examples.dist_sum
# fetch the fbpkgs
cd ~/tmp
fbpkg fetch --symlink-tags -o -d . jetter:prod
fbpkg fetch --symlink-tags -o -d . torchx.examples.dist_sum.base
fbpkg fetch --symlink-tags -o -d . torchx.examples.dist_sum
jetter/LAST/jetter apply-and-run \
torchx.examples.dist_sum.base/LAST/torchrun \
torchx.examples.dist_sum/LAST \
-- \
--as_function \
--rdzv_id foobar \
--nnodes 1 \
--nproc_per_node 2 \
--max_restarts 0 \
--role worker \
--no_python \
~/torchx.examples.dist_sum/LAST/pytorch/elastic/examples/distributed_sum/fb/main.py
```
## Mast Testing
```
buck-out/gen/pytorch/elastic/torchelastic/tsm/fb/cli/tsm.par run_ddp \
--scheduler mast \
--base_fbpkg torchx.examples.dist_sum.base:78f01b5 \
--fbpkg torchx.examples.dist_sum:f38ab46 \
--run_cfg hpcClusterUuid=MastNaoTestCluster,hpcIdentity=pytorch_r2p,hpcJobOncall=pytorch_r2p \
--nnodes 2 \
--resource T1 \
--nproc_per_node 4 \
--name kiuk_jetter_test \
pytorch/elastic/examples/distributed_sum/fb/main.py
```
Runs successfully: https://www.internalfb.com/mast/job/tsm_kiuk-kiuk_jetter_test_34c9f0fa?
Reviewed By: tierex, yifuwang
Differential Revision: D28177553
fbshipit-source-id: 29daada4bc26e5ef0949bf75215f35e557bd35b8