Shard distributed tests on non-CUDA focal (#84891)
The [pull / linux-focal-py3.7-gcc7 / test (distributed, 1, 1, linux.2xlarge)](https://hud.pytorch.org/tts/pytorch/pytorch/master?jobName=pull%20%2F%20linux-focal-py3.7-gcc7%20%2F%20test%20(distributed%2C%201%2C%201%2C%20linux.2xlarge)) job's p90 time-to-signal (TTS) is about 2.2 hours, roughly twice that of the default shards. Since these tests run on common non-CUDA Linux runners, we can simply add one more shard for distributed tests. I missed this change in https://github.com/pytorch/pytorch/pull/84430. A sketch of the matrix change is below.
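For context, this is a minimal sketch of what adding a shard looks like in the workflow's test matrix; the file path and exact entries are assumptions for illustration, not quoted from this PR's diff, but PyTorch's pull workflow declares shards in this style:

```yaml
# .github/workflows/pull.yml (sketch; exact entries are assumptions)
# Splitting the single distributed shard into two halves the per-shard runtime.
test-matrix: |
  { include: [
    { config: "distributed", shard: 1, num_shards: 2, runner: "linux.2xlarge" },
    { config: "distributed", shard: 2, num_shards: 2, runner: "linux.2xlarge" },
  ]}
```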
### Testing
With the 2 shards, test time comes down to around 55 minutes each:
* https://github.com/pytorch/pytorch/actions/runs/3040900328/jobs/4897576932
* https://github.com/pytorch/pytorch/actions/runs/3040900328/jobs/4897577014
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84891
Approved by: https://github.com/clee2000