[PyTorch] Avoid double indirection in MaybeOwned's borrowed state (#55685)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55685
This diff introduces a traits class that tells `MaybeOwned` how
to borrow a specific type. While it is still capable of handling a
generic `T` by storing `const T*` and how to do so is shown in a
comment, it is not committed in live code because it is not needed.
Instead, we have specific traits implementations for
`c10::intrusive_ptr<T>` and `Tensor` that implement the borrowed state
as just a plain old `c10::intrusive_ptr<T>` or `Tensor` (respectively)
that we manipulate to avoid reference counting operations. We do this
entirely with public API to `c10::intrusive_ptr<T>` and could do
likewise with `Tensor`, but (as comments in the code explain) adding a
private constructor to `MaybeOwnedTraits<Tensor>` allowed additional
performance optimization.
This representation of `MaybeOwned` seems to be more efficient than
the generic `T-or-pointer-to-const-T` representation. Intuitively, we
avoid a double indirection at minimal cost vs the previous
implementation. It *also* seems to be more efficient than the pointer
tagging representation I sent out as #55555; apparently, having the
extra word for a flag is much cheaper than the masking operands for
pointer tagging and the same double indirection as the generic
representation.
In particular, this seems to have the same *effect* as the
`TensorHandle` idea we've discussed internally (a hypothetical class
like `Tensor` that wraps a raw `TensorImpl*` and shares the generated
methods of `Tensor` so that everything still works), but you have to
be explicit about borrowing and use pointer syntax to get the
effect. Unlike `TensorHandle`, you can use it as internal state in a
class and "upgrade" from a borrow to an owned `Tensor` derived from
your original borrow if necessary.
Note that this is just a representational change and it still has the
same semantics: you need to keep the T you borrowed from around!
ghstack-source-id: 126405920
Test Plan:
Previous diff changes the MaybeOwned tests to cover
both `intrusive_ptr` and `Tensor`, which we need in order to ensure
that our trait implementations are correct.
Further diffs in this stack will use this type to hold operand tensors
in `TensorIteratorBase` to allow borrowing at relatively small cost
(very roughly, a 6% win in the successful borrowing case for our
add-in-place benchmark at the cost of a 2.5% regression in the
legacy non-borrowing case, and we know that we will be able to borrow
in structured kernels and probably most unstructured operands as
well).
Reviewed By: ezyang
Differential Revision: D27679723
fbshipit-source-id: 57104f4edabc545ff83657233fde9eb40b969826