[PyTorch] Stack-allocate boxed args for RecordFunction
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76266
Stack-allocating the boxed arguments saves a heap allocation on the RecordFunction path, which improves performance.
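
A rough illustration of the general pattern, not the code in this PR: when the number of arguments is known at compile time, the boxed values can live in a fixed-size array on the stack instead of a heap-backed `std::vector`. `Boxed`, `sum_boxed_heap`, and `sum_boxed_stack` are hypothetical names made up for this sketch; the real change works on PyTorch's own boxed-argument types.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a boxed value (assumption: not c10::IValue).
struct Boxed {
  std::int64_t payload;
};

// Baseline: boxing into a std::vector typically heap-allocates on each call
// with at least one argument.
template <typename... Args>
std::int64_t sum_boxed_heap(Args... args) {
  std::vector<Boxed> boxed{Boxed{static_cast<std::int64_t>(args)}...};
  std::int64_t total = 0;
  for (const Boxed& b : boxed) {
    total += b.payload;
  }
  return total;
}

// Sketch: sizeof...(Args) is a compile-time constant, so the boxed values
// fit in a std::array with automatic (stack) storage and no heap allocation.
template <typename... Args>
std::int64_t sum_boxed_stack(Args... args) {
  std::array<Boxed, sizeof...(Args)> boxed{
      Boxed{static_cast<std::int64_t>(args)}...};
  std::int64_t total = 0;
  for (const Boxed& b : boxed) {
    total += b.payload;
  }
  return total;
}
```

Both functions compute the same result; only where the temporary boxed values are stored differs.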
Differential Revision: [D34090699](https://our.internmc.facebook.com/intern/diff/D34090699/)
Approved by: https://github.com/ezyang