[profiler] Add Linux Perf support (#87866)
* Add support for using the Linux kernel perf subsystem through the profiler.
* For now, perf configuration is limited to event names. Threading and other options will come later.
* Because we want to support a variety of CPU types, the list of supported events (beyond the standard set of events) is also limited.
* On non-Linux platforms, the profiler returns zeros for all event counts rather than failing with an unsupported-feature error.
* For now, the number of events is capped at 4, and time multiplexing is not allowed.
* The threadpool recreate hack is restricted to mobile only. Better general support for threading still needs to be added.
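
The behavior described in the bullets above (a cap of 4 events, and zeros instead of an error on non-Linux platforms) can be illustrated with a small sketch. This is not the code from this PR: `MAX_EVENTS`, `validate_events`, `read_counts`, and the `reader` callback are hypothetical names used only to model the described contract, and raising `ValueError` when the cap is exceeded is an assumption.

```python
import sys

# Cap stated in this commit message; the constant name is hypothetical.
MAX_EVENTS = 4


def validate_events(events):
    """Reject event lists longer than the cap (no time multiplexing)."""
    events = list(events)
    if len(events) > MAX_EVENTS:
        # Assumption: the real profiler may report this differently.
        raise ValueError(
            f"at most {MAX_EVENTS} perf events are supported, got {len(events)}"
        )
    return events


def read_counts(events, reader, platform=sys.platform):
    """Return a {event_name: count} mapping.

    `reader` is a hypothetical callable standing in for whatever reads a
    counter from the kernel perf subsystem on Linux.
    """
    events = validate_events(events)
    if not platform.startswith("linux"):
        # Non-Linux platforms: report zeros rather than failing.
        return {name: 0 for name in events}
    return {name: reader(name) for name in events}
```

Under these assumptions, calling `read_counts(["cycles", "instructions"], reader, platform="darwin")` yields a zero for each event without ever invoking `reader`, while passing five event names raises before any counter is read.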
Differential Revision: [D40238033](https://our.internmc.facebook.com/intern/diff/D40238033/)
**NOTE FOR REVIEWERS**: This PR has internal Meta-specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D40238033/)!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87866
Approved by: https://github.com/SS-JIA