Python Jiterator supports multiple outputs (#78139)
This PR is part 3.
Part1: https://github.com/pytorch/pytorch/pull/77902
Part2: https://github.com/pytorch/pytorch/pull/77921
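The Python Jiterator now supports kernels that return multiple outputs. The kernel writes each output through a trailing reference parameter, and `num_outputs` states how many outputs there are: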
```
import torch

# Outputs are written through the trailing reference parameters, so the kernel returns void.
fn = torch.cuda.jiterator._create_multi_output_jit_fn(
"""
template <typename T>
void binary_2outputs(T i0, T i1, T& out0, T& out1) {
    out0 = i0 + i1;
    out1 = i0 - i1;
}
""",
num_outputs=2)

x = torch.rand(3, device='cuda')
y = torch.rand(3, device='cuda')
out0, out1 = fn(x, y)
assert torch.allclose(out0, x+y)
assert torch.allclose(out1, x-y)
```
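For readers without a CUDA device, the calling convention in the example can be modelled in plain Python. This is only a sketch of the semantics, not how Jiterator is implemented: `apply_multi_output` and the Python version of `binary_2outputs` are hypothetical helpers written for illustration, and lists stand in for tensors.

```python
from typing import Callable, List, Sequence, Tuple


def apply_multi_output(
    kernel: Callable[..., Tuple[float, ...]],
    num_outputs: int,
    *inputs: Sequence[float],
) -> Tuple[List[float], ...]:
    """Hypothetical CPU reference: run `kernel` elementwise over the inputs
    and collect its results into `num_outputs` separate lists."""
    outputs: Tuple[List[float], ...] = tuple([] for _ in range(num_outputs))
    for args in zip(*inputs):
        results = kernel(*args)
        if len(results) != num_outputs:
            raise ValueError(
                f"kernel produced {len(results)} values, expected {num_outputs}"
            )
        # Route the k-th result of this element to the k-th output list.
        for out, value in zip(outputs, results):
            out.append(value)
    return outputs


def binary_2outputs(i0: float, i1: float) -> Tuple[float, float]:
    # Mirrors the C++ kernel body: out0 = i0 + i1, out1 = i0 - i1.
    return i0 + i1, i0 - i1


x = [1.0, 2.0, 3.0]
y = [0.5, 0.25, 4.0]
out0, out1 = apply_multi_output(binary_2outputs, 2, x, y)
```

The point of the model is that a single elementwise pass produces every output, and the caller unpacks one result per declared output, as in `out0, out1 = fn(x, y)`.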
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78139
Approved by: https://github.com/ngimel