Disable (Broken) CUDAStreamVariable in dynamo (#100766)
While exploring XLTransformers with PT2, I found that we leak tracing-time objects (VariableTrackers) into the runtime:
```
Traceback (most recent call last):
  File "/scratch/voz/work/xlformers/train.py", line 686, in <module>
    main(cfg)
  File "/scratch/voz/work/xlformers/train.py", line 357, in main
    pred, _ = model(x)
  File "/scratch/voz/work/pytorch/torch/nn/modules/module.py", line 1502, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/scratch/voz/work/pytorch/torch/nn/modules/module.py", line 1511, in _call_impl
    return forward_call(*args, **kwargs)
  File "/scratch/voz/work/pytorch/torch/_dynamo/eval_frame.py", line 282, in _fn
    return fn(*args, **kwargs)
  File "/data/home/voz/miniconda3/envs/torch5/lib/python3.10/site-packages/fairscale/nn/data_parallel/fully_sharded_data_parallel.py", line 1416, in forward
    self._lazy_init()
  File "/data/home/voz/miniconda3/envs/torch5/lib/python3.10/site-packages/fairscale/nn/data_parallel/fully_sharded_data_parallel.py", line 1424, in <resume in forward>
    args, kwargs = cast_floats_to_right_precision(True, True, *args, **kwargs)
  File "/data/home/voz/miniconda3/envs/torch5/lib/python3.10/site-packages/fairscale/nn/data_parallel/fully_sharded_data_parallel.py", line 1434, in <resume in forward>
    self._rebuild_full_params()
  File "/scratch/voz/work/pytorch/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/data/home/voz/miniconda3/envs/torch5/lib/python3.10/site-packages/fairscale/nn/data_parallel/fully_sharded_data_parallel.py", line 1932, in _rebuild_full_params
    def update_p_data(custom_output_tensor: Optional[torch.Tensor] = None) -> None:
  File "/data/home/voz/miniconda3/envs/torch5/lib/python3.10/site-packages/fairscale/nn/data_parallel/fully_sharded_data_parallel.py", line 1932, in <resume in _rebuild_full_params>
    def update_p_data(custom_output_tensor: Optional[torch.Tensor] = None) -> None:
  File "/scratch/voz/work/pytorch/torch/cuda/__init__.py", line 464, in __enter__
    if self.src_prev_stream.device != cur_stream.device:
AttributeError: 'CUDAStreamVariable' object has no attribute 'device'
```
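This indicates a serious bug: a `CUDAStreamVariable`, which should exist only while dynamo is tracing, stands in for one of the stream objects compared in `__enter__` in `torch/cuda/__init__.py` at runtime, so the `.device` lookup fails. This PR disables the broken `CUDAStreamVariable` in dynamo.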
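For illustration only, here is a minimal sketch of the failure mode in the traceback above, written without PyTorch. The names `Stream`, `StreamTracker`, and `same_device` are hypothetical stand-ins and are not dynamo or `torch.cuda` APIs; the sketch only assumes that a tracing-time wrapper holds the real object without exposing its attributes.

```python
class Stream:
    """Hypothetical stand-in for a real runtime stream (not torch.cuda.Stream)."""

    def __init__(self, device):
        self.device = device


class StreamTracker:
    """Hypothetical stand-in for a tracing-time wrapper such as CUDAStreamVariable."""

    def __init__(self, value):
        # The real object is stored on the wrapper; its attributes are not forwarded.
        self.value = value


def same_device(prev_stream, cur_stream):
    # Runtime code expects real stream objects that have a .device attribute.
    return prev_stream.device == cur_stream.device


real = Stream("cuda:0")
leaked = StreamTracker(real)  # wrapper that escaped tracing

ok = same_device(real, real)  # real objects compare fine

try:
    same_device(leaked, real)  # wrapper has no .device
    error = None
except AttributeError as exc:
    error = str(exc)

print(ok, error)
```

The wrapper fails at the same kind of attribute access as in the traceback, which is why the object must be unwrapped before any runtime code sees it.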
Pull Request resolved: https://github.com/pytorch/pytorch/pull/100766
Approved by: https://github.com/ezyang