fix functionalization <> fake tensor mode (#83701)
The bug is that:
(1) Functionalization kernels internally call `at::empty_strided()` to construct meta tensors, and then call the meta tensor op.
(2) This happens with the Python dispatch key already added to the TLS exclude set, so we expect these meta tensors never to enter Python.
(3) When calling `detach()`, though, `TensorImpl::shallow_copy_and_detach()` currently always calls into Python when a PythonMode is set, regardless of the exclude set.

The fix: `TensorImpl::shallow_copy_and_detach()` now first checks whether the Python key is in the TLS exclude set, and only calls into Python if it is not.
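
For illustration only, the control flow described in (1)-(3) can be modeled with a small pure-Python toy. This is not PyTorch code: `exclude_key`, `detach_before_fix`, `detach_after_fix`, and the string key `"Python"` are hypothetical stand-ins for the TLS exclude set and `TensorImpl::shallow_copy_and_detach()`, assumed here only to show the difference in behavior.

```python
import threading
from contextlib import contextmanager

# Hypothetical stand-in for the dispatcher's thread-local exclude set.
_tls = threading.local()


def _excluded():
    """Return this thread's set of excluded dispatch keys (toy model)."""
    if not hasattr(_tls, "excluded"):
        _tls.excluded = set()
    return _tls.excluded


@contextmanager
def exclude_key(key):
    """Temporarily add `key` to the thread-local exclude set."""
    excluded = _excluded()
    newly_added = key not in excluded
    excluded.add(key)
    try:
        yield
    finally:
        # Only undo the exclusion if this guard was the one that added it.
        if newly_added:
            excluded.discard(key)


def detach_before_fix(python_mode_active):
    """Old behavior: an active mode always routes detach into Python."""
    return "python" if python_mode_active else "c++"


def detach_after_fix(python_mode_active):
    """New behavior: consult the exclude set before routing into Python."""
    if python_mode_active and "Python" not in _excluded():
        return "python"
    return "c++"


if __name__ == "__main__":
    # Inside a functionalization kernel the Python key is excluded.
    with exclude_key("Python"):
        print(detach_before_fix(True))  # old behavior, mode ignored the guard
        print(detach_after_fix(True))   # new behavior, guard is respected
```

In this toy, the old path ignores the guard and still reports a call into Python, while the new path stays in C++ whenever the key is excluded and behaves as before once the guard is released.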
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83701
Approved by: https://github.com/ezyang