[RecordFunction] Store a c10::variant of name and schema rather than both. (#76017)
Summary: RecordFunction can be created either with a `c10::OperatorHandle` (from the dispatcher) or with a string (everywhere else). We store a bunch of fields in RecordFunction to handle both paths, which also complicates the logic: the input and output sizes can either be the sizes of `inputs_` and `outputs_` or be taken from the schema, and that significantly complicates later changes. Because the dispatcher is the only place where we call the schema-based method, we can just bind a reference to the schema and pass a `reference_wrapper`. (This differs from the other proposal, in which RecordFunction holds a schema pointer; here the caller just guarantees that the schema outlives the guard.)
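A rough illustration of the idea, not the actual RecordFunction code: the sketch below uses `std::variant` as a stand-in for `c10::variant`, and `FakeSchema`, `GuardSketch`, and `usage_sketch` are hypothetical names invented for this example. It only shows how a single variant of "name" or "reference to schema" can replace separate fields, with the input count derived from whichever alternative is active.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <variant>
#include <vector>

// Hypothetical stand-in for a function schema; only the fields needed here.
struct FakeSchema {
  std::string name;
  std::size_t num_arguments;
  std::size_t num_returns;
};

// Hypothetical guard that stores either a name or a schema reference,
// never both.
class GuardSketch {
 public:
  using SchemaRef = std::reference_wrapper<const FakeSchema>;

  // String path: the guard owns the name and counts its own inputs.
  explicit GuardSketch(std::string name, std::vector<int> inputs = {})
      : fn_(std::move(name)), inputs_(std::move(inputs)) {}

  // Schema path: the caller must keep `schema` alive while the guard exists.
  explicit GuardSketch(SchemaRef schema, std::vector<int> inputs = {})
      : fn_(schema), inputs_(std::move(inputs)) {}

  // Name comes from the string, or from the schema when one is bound.
  std::string name() const {
    if (const auto* s = std::get_if<std::string>(&fn_)) {
      return *s;
    }
    return std::get<SchemaRef>(fn_).get().name;
  }

  // Input count comes from the schema when bound, else from inputs_.
  std::size_t num_inputs() const {
    if (const auto* r = std::get_if<SchemaRef>(&fn_)) {
      return r->get().num_arguments;
    }
    return inputs_.size();
  }

 private:
  std::variant<std::string, SchemaRef> fn_;
  std::vector<int> inputs_;
};

// Usage sketch: both construction paths answer the same queries.
inline void usage_sketch() {
  const FakeSchema schema{"aten::add", 3, 1};
  GuardSketch from_schema(std::cref(schema));  // schema outlives the guard
  GuardSketch from_name("my_scope", {1, 2});
  assert(from_schema.num_inputs() == 3);
  assert(from_name.num_inputs() == 2);
}
```

Holding a `reference_wrapper` keeps the guard cheap to construct and destroy, at the cost of making the lifetime guarantee the caller's responsibility.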
Test Plan: Ran the overhead benchmark. It helps quite a bit (0.37us -> 0.33us), presumably because there's just a lot less state (and assigns / dtors) in the guard.
Reviewed By: chaekit
Differential Revision: D35651041
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76017
Approved by: https://github.com/chaekit