Add CUDA Sanitizer (#83984)
Example of a simple synchronization error:
```
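import torch

second_stream = torch.cuda.Stream()
a = torch.rand(4, 2, device="cuda")  # written on the default stream
with torch.cuda.stream(second_stream):
    torch.mul(a, 5, out=a)  # written on second_stream without waiting for rand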
```
Output produced by the CUDA Sanitizer (CSAN):
```
============================
CSAN detected a possible data race on tensor with data pointer 139719969079296
Access by stream 94646435460352 during kernel:
aten::mul.out(Tensor self, Tensor other, *, Tensor(a!) out) -> Tensor(a!)
writing to argument: self, out, output
With stack trace:
  File "/private/home/sypniewski/pytorch/torch/cuda/_sanitizer.py", line 364, in _handle_kernel_launch
    stack_trace = traceback.StackSummary.extract(
  File "/private/home/sypniewski/pytorch/torch/cuda/_sanitizer.py", line 544, in __torch_dispatch__
    errors = self.event_handler._handle_kernel_launch(
  File "/private/home/sypniewski/pytorch/torch/utils/_python_dispatch.py", line 76, in wrapped
    return f(self, *args, **kwargs)
  File "/private/home/sypniewski/pytorch/tester.py", line 9, in <module>
    torch.mul(a, 5, out=a)
Previous access by stream 0 during kernel:
aten::rand(int[] size, *, int? dtype=None, int? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
writing to argument: output
With stack trace:
  File "/private/home/sypniewski/pytorch/torch/cuda/_sanitizer.py", line 364, in _handle_kernel_launch
    stack_trace = traceback.StackSummary.extract(
  File "/private/home/sypniewski/pytorch/torch/cuda/_sanitizer.py", line 544, in __torch_dispatch__
    errors = self.event_handler._handle_kernel_launch(
  File "/private/home/sypniewski/pytorch/torch/utils/_python_dispatch.py", line 76, in wrapped
    return f(self, *args, **kwargs)
  File "/private/home/sypniewski/pytorch/tester.py", line 6, in <module>
    a = torch.rand(10000, device="cuda")
Tensor was allocated with stack trace:
  File "/private/home/sypniewski/pytorch/torch/cuda/_sanitizer.py", line 420, in _handle_memory_allocation
    traceback.StackSummary.extract(
  File "/private/home/sypniewski/pytorch/torch/utils/_cuda_trace.py", line 23, in fire_callbacks
    cb(*args, **kwargs)
  File "/private/home/sypniewski/pytorch/torch/_ops.py", line 60, in __call__
    return self._op(*args, **kwargs or {})
  File "/private/home/sypniewski/pytorch/torch/cuda/_sanitizer.py", line 541, in __torch_dispatch__
    outputs = func(*args, **kwargs)
  File "/private/home/sypniewski/pytorch/torch/utils/_python_dispatch.py", line 76, in wrapped
    return f(self, *args, **kwargs)
  File "/private/home/sypniewski/pytorch/tester.py", line 6, in <module>
    a = torch.rand(10000, device="cuda")
```
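To illustrate why the report above pairs an access on one stream with a previous access on another, here is a toy, pure-Python model of stream-ordered write checking. It is only a sketch and not the code in `torch/cuda/_sanitizer.py`: the class `ToyRaceChecker`, its methods, and the report string are hypothetical, and it tracks writes only. The stream ids and data pointer are copied from the report for readability.

```python
from dataclasses import dataclass, field


@dataclass
class ToyRaceChecker:
    """Toy model of stream-ordered write checking; NOT the CSAN implementation."""

    seq: int = 0                                      # global kernel launch counter
    last_launch: dict = field(default_factory=dict)   # stream -> seq of its latest kernel
    synced: dict = field(default_factory=dict)        # stream -> {other stream: seq known complete}
    last_write: dict = field(default_factory=dict)    # data pointer -> (stream, seq)

    def wait_stream(self, waiter, other):
        """Record that later work on `waiter` runs after everything queued on `other`."""
        state = self.synced.setdefault(waiter, {})
        # Inherit whatever `other` had already waited for (ordering is transitive).
        for stream, seq in self.synced.get(other, {}).items():
            state[stream] = max(state.get(stream, -1), seq)
        state[other] = max(state.get(other, -1), self.last_launch.get(other, -1))

    def write(self, stream, data_ptr):
        """Record a kernel on `stream` writing `data_ptr`; return a report or None."""
        self.seq += 1
        self.last_launch[stream] = self.seq
        report = None
        previous = self.last_write.get(data_ptr)
        if previous is not None:
            prev_stream, prev_seq = previous
            known_done = self.synced.get(stream, {}).get(prev_stream, -1)
            # Ordered if same stream, or if this stream waited past the earlier write.
            if prev_stream != stream and known_done < prev_seq:
                report = (
                    f"possible data race on {data_ptr}: "
                    f"stream {stream} after stream {prev_stream}"
                )
        self.last_write[data_ptr] = (stream, self.seq)
        return report


PTR, DEFAULT, SECOND = 139719969079296, 0, 94646435460352

# Mirrors the example: rand on the default stream, then mul on a second stream.
unsynced = ToyRaceChecker()
unsynced.write(DEFAULT, PTR)
race = unsynced.write(SECOND, PTR)

# Same accesses, but the second stream first waits on the default stream.
synced = ToyRaceChecker()
synced.write(DEFAULT, PTR)
synced.wait_stream(SECOND, DEFAULT)
no_race = synced.write(SECOND, PTR)
```

In this model `race` holds a report string and `no_race` is `None`. In the real program, the analogous fix would presumably be to call `second_stream.wait_stream(torch.cuda.current_stream())` before entering the `with torch.cuda.stream(second_stream):` block, so that `mul` is ordered after `rand`.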
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83984
Approved by: https://github.com/ezyang