Use ProcessPoolExecutor in the ufmt adapter (#106123)
When running on a host with multiple CPUs, the ufmt linter did not use them effectively. The biggest single culprit seems to be debug logging inside blib2to3 trying to acquire a lock, but disabling that doesn't help much, so I suppose the remainder must be GIL contention. Switching the adapter to a ProcessPoolExecutor makes it much faster.
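A minimal sketch of the idea, not the adapter's actual code: `check_file` and `check_all` are hypothetical names, and the pure-Python loop stands in for the CPU-bound formatting work. The assumption here is that the per-file work holds the GIL, so a thread pool cannot spread it over several cores, whereas worker processes each have their own interpreter. Both executors in `concurrent.futures` share the same interface, so the pool class can be swapped with a one-line change.

```python
import os
from concurrent.futures import ProcessPoolExecutor


def check_file(source: str) -> int:
    """Hypothetical stand-in for formatting one file (CPU-bound, holds the GIL)."""
    total = 0
    for ch in source:
        total = (total * 31 + ord(ch)) % 1_000_003
    return total


def check_all(sources, executor_cls=ProcessPoolExecutor):
    """Run check_file over all sources using the given executor class.

    The worker must be a module-level function so that a process pool
    can pickle a reference to it and send it to the worker processes.
    """
    with executor_cls(max_workers=os.cpu_count()) as pool:
        # map() preserves input order in its results.
        return list(pool.map(check_file, sources))
```

Passing `ThreadPoolExecutor` as `executor_cls` gives the thread-based variant for comparison. When calling the process-based variant from a script, keep the call under an `if __name__ == "__main__":` guard so it also works with the spawn start method.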
The following timings are from a PaperSpace GPU+ instance with 8 vCPUs (the cores show up as Intel(R) Xeon(R) CPU E5-2623 v4 @ 2.60GHz, but I'm not sure whether they are shared with other instances).
On main:
```
$ time lintrunner --all-files --take UFMT
ok No lint issues.
real 7m46.140s
user 8m0.828s
sys 0m5.446s
```
On this branch:
```
$ time lintrunner --all-files --take UFMT
ok No lint issues.
real 1m7.255s
user 8m13.388s
sys 0m3.506s
```
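As a rough reading of the two timings above: the wall-clock speedup is `real` on main divided by `real` on this branch, and `user / real` approximates how many cores were kept busy on average. The helper name `to_seconds` is just for illustration.

```python
def to_seconds(minutes: int, seconds: float) -> float:
    """Convert the m/s format printed by `time` into seconds."""
    return minutes * 60 + seconds


main_real = to_seconds(7, 46.140)
main_user = to_seconds(8, 0.828)
branch_real = to_seconds(1, 7.255)
branch_user = to_seconds(8, 13.388)

speedup = main_real / branch_real          # roughly 6.9x faster wall clock
main_cores = main_user / main_real         # roughly 1.0 cores busy on main
branch_cores = branch_user / branch_real   # roughly 7.3 of 8 vCPUs busy

print(f"speedup: {speedup:.2f}x")
print(f"cores busy on main: {main_cores:.2f}")
print(f"cores busy on branch: {branch_cores:.2f}")
```

Total user time is nearly unchanged between the two runs, which is consistent with the same amount of work being done but spread across processes instead of serialized.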
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106123
Approved by: https://github.com/ezyang