Clip Binomial results for different endpoints in curand_uniform (#42702)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/42153
As [documented](https://docs.nvidia.com/cuda/curand/device-api-overview.html) (search for `curand_uniform` on the page), `curand_uniform` returns values "from 0.0 to 1.0, where 1.0 is included and 0.0 is excluded." These endpoints differ from those of the CPU equivalent, which makes the calculation in the PR fail when the value is 1.0.
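To illustrate the endpoint mismatch, here is a minimal Python sketch, not the actual kernel code. It assumes the CPU equivalent draws from the half-open range `[0.0, 1.0)`; the helper names (`in_half_open_range`, `in_curand_uniform_range`, `clip_count`) are hypothetical, and the clamp only shows the general idea of clipping a sampled count into the Binomial support `[0, n]`.

```python
import math


def in_half_open_range(u):
    # Assumed CPU-style range: 0.0 included, 1.0 excluded.
    return 0.0 <= u < 1.0


def in_curand_uniform_range(u):
    # Range documented for curand_uniform: 0.0 excluded, 1.0 included.
    return 0.0 < u <= 1.0


def clip_count(k, n):
    # Hypothetical helper: clamp a sampled count into the support [0, n].
    return min(max(k, 0), n)


# 1.0 is a legal curand_uniform value but lies outside the half-open range.
print(in_curand_uniform_range(1.0), in_half_open_range(1.0))

# Out-of-range counts are pulled back to the nearest valid endpoint.
print(clip_count(math.inf, 10), clip_count(-3, 10), clip_count(4, 10))
```

Clamping the result rather than remapping the uniform value keeps the sketch independent of how the out-of-range count was produced.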
The test from the issue is added; it failed for me consistently before this PR, even though I cut the number of samples by 10.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42702
Reviewed By: gchanan
Differential Revision: D23107451
Pulled By: ngimel
fbshipit-source-id: 3575d5b8cd5668e74b5edbecd95154b51aa485a1