[easy] ThroughputBenchmark: print out ATen's parallel settings before execution (#35632)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35632
This is handy for checking that the parallel settings in effect match your expectations. Here is an example of the output:
```
I0328 15:55:12.336715 41258 throughput_benchmark-inl.h:23] ATen/Parallel:
at::get_num_threads() : 1
at::get_num_interop_threads() : 14
OpenMP 201511 (a.k.a. OpenMP 4.5)
omp_get_max_threads() : 1
Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
mkl_get_max_threads() : 1
Intel(R) MKL-DNN v0.21.1 (Git Hash 7d2fd500bc78936d1d648ca713b901012f470dbc)
std::thread::hardware_concurrency() : 28
Environment variables:
OMP_NUM_THREADS : 1
MKL_NUM_THREADS : [not set]
ATen parallel backend: OpenMP
```
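For a similar sanity check from Python, outside of ThroughputBenchmark, something like the sketch below may help. It is only an illustration and not the code added in this PR: `format_env_report` and `parallel_report` are hypothetical helper names, and the fallback branch only mimics the "Environment variables" part of the report using the standard library. When `torch` is importable, it defers to `torch.__config__.parallel_info()`.

```python
import os

# Thread-related environment variables that appear in the report above.
ENV_VARS = ("OMP_NUM_THREADS", "MKL_NUM_THREADS")


def format_env_report(environ=None, names=ENV_VARS):
    """Hypothetical helper: list env vars, marking missing ones as '[not set]'."""
    if environ is None:
        environ = os.environ
    lines = ["Environment variables:"]
    for name in names:
        lines.append(f"{name} : {environ.get(name, '[not set]')}")
    return "\n".join(lines)


def parallel_report():
    """Return PyTorch's own report if torch is importable, else a stdlib fallback."""
    try:
        import torch
    except ImportError:
        # Fallback: CPU count and env vars only (no OpenMP/MKL details).
        return f"os.cpu_count() : {os.cpu_count()}\n{format_env_report()}"
    return torch.__config__.parallel_info()


if __name__ == "__main__":
    print(parallel_report())
```

Running this before a benchmark makes it easy to spot cases like the one above, where `OMP_NUM_THREADS` is set to 1 on a machine with many more hardware threads.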
Test Plan: Imported from OSS
Differential Revision: D20731331
Pulled By: ezyang
fbshipit-source-id: 5be7ffb23db49b1771c2f563b5d84180c3a0ba7f