Use cpus_per_task=12 instead of 10 (#1281)
Summary:
Pull Request resolved: https://github.com/pytorch/benchmark/pull/1281
Each machine in the AWS cluster has 96 CPUs and 8 GPUs, so 12 CPUs per
task (96 / 8) means we use all the CPUs.
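
The arithmetic above can be sketched as follows. This is only an illustration, not code from this PR: the function names are hypothetical, and it assumes one task per GPU (8 tasks per machine).

```python
def cpus_per_task(total_cpus: int, tasks_per_node: int) -> int:
    """Largest whole number of CPUs each task can get on one machine."""
    return total_cpus // tasks_per_node


def idle_cpus(total_cpus: int, tasks_per_node: int, per_task: int) -> int:
    """CPUs left unassigned on one machine for a given per-task setting."""
    return total_cpus - tasks_per_node * per_task


# Assumed layout: 96 CPUs and 8 tasks (one per GPU) per machine.
TOTAL_CPUS = 96
TASKS_PER_NODE = 8

new_setting = cpus_per_task(TOTAL_CPUS, TASKS_PER_NODE)
print("cpus_per_task:", new_setting)
print("idle CPUs with 10 per task:", idle_cpus(TOTAL_CPUS, TASKS_PER_NODE, 10))
print("idle CPUs with 12 per task:", idle_cpus(TOTAL_CPUS, TASKS_PER_NODE, new_setting))
```

With cpus_per_task=10, 16 of the 96 CPUs on each machine are not assigned to any task; with 12, none are left over.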
I suspect that we were running into issues with bad CPU affinity with
cpus_per_task=10, because the 4th rank was often slower than the other
ranks.
Test Plan: Imported from OSS
Reviewed By: xuzhao9
Differential Revision: D41016736
Pulled By: davidberard98
fbshipit-source-id: 5a6c6f1cf59ae6de243f11e7ae099c4b92b881d2