[PyTorch] Optimize mergeRunCallbacks for RecordFunction (#68387)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68387
Function call overhead on tryRunCallback was notable.
ghstack-source-id: 144235788
Test Plan:
Run //caffe2/caffe2/fb/high_perf_models/pytorch/benchmark_framework_overheads:cpp_benchmark before/after this diff with arguments `--stressTestRecordFunction --op empty`.
Before: P467891339
After: P467891381
Reviewed By: chaekit
Differential Revision: D32443863
fbshipit-source-id: c0b3dd40bbd5bca976c2ebb0f21aa62e097b302e