Enables _do_cuda_non_default_stream (#25989)
Summary:
Now that backward reuses forward streams, calls to backward no longer need to be explicitly synced in the great majority of cases. This makes it possible to enable the _do_cuda_non_default_stream flag, which this PR does for test_cuda.py and test_distributions.py, where the flag was previously defined but set to False.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25989
Test Plan: The change affects the test suite itself, so running the test suite is the test plan.
Differential Revision: D17329233
Pulled By: mruberry
fbshipit-source-id: 52f65b5ed53de26e35e6d022658d7fac22609f6a