Fix DLPack CUDA stream convention (#67618)
Summary:
The array API specification for __dlpack__ identifies the CUDA legacy default stream and the per-thread default stream as 1 and 2, not 0 and 1 (0 is disallowed because it is ambiguous):
https://data-apis.org/array-api/latest/API_specification/array_object.html?dlpack-self-stream-none#dlpack-self-stream-none
Using 0 and 1 caused a problem in the interop with CuPy: https://github.com/cupy/cupy/pull/5970#discussion_r739912926
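
For illustration only, the stream numbering described above can be sketched in plain Python. The helper name describe_cuda_dlpack_stream is hypothetical and is not part of PyTorch, CuPy, or the array API; it only encodes how a consumer-supplied stream value is interpreted for CUDA under the convention linked above, and it leaves negative values out of scope.

```python
from typing import Optional


def describe_cuda_dlpack_stream(stream: Optional[int]) -> str:
    """Hypothetical helper: interpret a `stream` value passed to __dlpack__ on CUDA.

    This is a sketch of the numbering convention, not library code.
    """
    # None means the producer should assume the legacy default stream,
    # which is also what the explicit value 1 denotes.
    if stream is None or stream == 1:
        return "legacy default stream"
    # 2 denotes the per-thread default stream.
    if stream == 2:
        return "per-thread default stream"
    # 0 is rejected: it could be read as None, 1, or 2.
    if stream == 0:
        raise ValueError("stream=0 is ambiguous and disallowed for CUDA")
    # Values above 2 are the stream handle itself, passed as a Python integer.
    if stream > 2:
        return f"explicit stream handle {stream}"
    # Assumption: negative values are not covered by this sketch.
    return "outside the scope of this sketch"


if __name__ == "__main__":
    for value in (None, 1, 2, 140234):
        print(value, "->", describe_cuda_dlpack_stream(value))
```

Under the old numbering, a producer that treated 1 as the per-thread stream would disagree with a consumer that passes 1 to mean the legacy default stream, which matches the kind of mismatch reported in the CuPy discussion.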
cc rgommers leofang mruberry
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67618
Reviewed By: albanD
Differential Revision: D32521805
Pulled By: mruberry
fbshipit-source-id: 95777e4014e5edf1f88ba10adc03c6e34c13248d