pytorch
d55aad1f - Disable (Broken) CUDAStreamVariable in dynamo (#100766)

Commit
1 year ago
Disable (Broken) CUDAStreamVariable in dynamo (#100766)

While attempting to explore XLTransformers w/ PT2, I found that we leak tracing time objects (VariableTrackers) into the runtime:

```
Traceback (most recent call last):
  File "/scratch/voz/work/xlformers/train.py", line 686, in <module>
    main(cfg)
  File "/scratch/voz/work/xlformers/train.py", line 357, in main
    pred, _ = model(x)
  File "/scratch/voz/work/pytorch/torch/nn/modules/module.py", line 1502, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/scratch/voz/work/pytorch/torch/nn/modules/module.py", line 1511, in _call_impl
    return forward_call(*args, **kwargs)
  File "/scratch/voz/work/pytorch/torch/_dynamo/eval_frame.py", line 282, in _fn
    return fn(*args, **kwargs)
  File "/data/home/voz/miniconda3/envs/torch5/lib/python3.10/site-packages/fairscale/nn/data_parallel/fully_sharded_data_parallel.py", line 1416, in forward
    self._lazy_init()
  File "/data/home/voz/miniconda3/envs/torch5/lib/python3.10/site-packages/fairscale/nn/data_parallel/fully_sharded_data_parallel.py", line 1424, in <resume in forward>
    args, kwargs = cast_floats_to_right_precision(True, True, *args, **kwargs)
  File "/data/home/voz/miniconda3/envs/torch5/lib/python3.10/site-packages/fairscale/nn/data_parallel/fully_sharded_data_parallel.py", line 1434, in <resume in forward>
    self._rebuild_full_params()
  File "/scratch/voz/work/pytorch/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/data/home/voz/miniconda3/envs/torch5/lib/python3.10/site-packages/fairscale/nn/data_parallel/fully_sharded_data_parallel.py", line 1932, in _rebuild_full_params
    def update_p_data(custom_output_tensor: Optional[torch.Tensor] = None) -> None:
  File "/data/home/voz/miniconda3/envs/torch5/lib/python3.10/site-packages/fairscale/nn/data_parallel/fully_sharded_data_parallel.py", line 1932, in <resume in _rebuild_full_params>
    def update_p_data(custom_output_tensor: Optional[torch.Tensor] = None) -> None:
  File "/scratch/voz/work/pytorch/torch/cuda/__init__.py", line 464, in __enter__
    if self.src_prev_stream.device != cur_stream.device:
AttributeError: 'CUDAStreamVariable' object has no attribute 'device'
```

This indicates a serious bug.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/100766
Approved by: https://github.com/ezyang
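The failure mode in the traceback can be illustrated with a toy model that does not use PyTorch at all. This is only a sketch under stated assumptions: `FakeStream`, `StreamTracker`, `same_device`, and `check_leak` are hypothetical names invented for illustration, not dynamo's actual classes. It shows why runtime code that expects a stream with a `device` attribute raises `AttributeError` when it is handed a trace-time wrapper instead.

```python
class FakeStream:
    """Hypothetical stand-in for a runtime stream that exposes `device`."""

    def __init__(self, device):
        self.device = device


class StreamTracker:
    """Hypothetical trace-time wrapper: it describes a stream but is not one."""

    def __init__(self, stream):
        # The real object is reachable only through `.value`;
        # the wrapper itself has no `device` attribute.
        self.value = stream


def same_device(prev_stream, cur_stream):
    # Same shape as the comparison that fails in the traceback above.
    return prev_stream.device == cur_stream.device


def check_leak():
    """Return the error message raised when a wrapper reaches runtime code."""
    real = FakeStream("cuda:0")
    leaked = StreamTracker(real)  # wrapper escapes instead of being unwrapped
    try:
        same_device(leaked, real)
    except AttributeError as exc:
        return str(exc)
    return None
```

Calling `check_leak()` returns an `AttributeError` message naming the wrapper class, which has the same form as the `'CUDAStreamVariable' object has no attribute 'device'` error reported above.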