pytorch
cf17fd6d - Fix multinomial CUDA misalignment and non-deterministic behavior (#55364)

Summary:
Fixes https://github.com/pytorch/pytorch/issues/46702

- Fails on probability distributions with an odd number of items: the kernel tries to access an `acc_type` (`float`) in memory aligned for `scalar_t` (`float16`).
- Produces unrepeatable results for large input tensors: the parallel cumsum is not monotonic at some positions.

### Fixes

- Computing the cumsum on `acc_type` (`float`) instead of `scalar_t` (`float16`) fixes both issues.
- The non-monotonic behavior may still occur even with `float`; in those cases, deterministic behavior is achieved by eliminating the race condition when writing the result, using the atomic function `atomicMax`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/55364

Reviewed By: mruberry

Differential Revision: D28031666

Pulled By: ngimel

fbshipit-source-id: 0fc6289e0b9ea2d31ef3771e7ca370de8f5c02de
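For context, here is a minimal sketch of the `atomicMax` idea described above. This is not PyTorch's actual kernel; `sample_one_kernel`, `cumdist`, and `out_index` are hypothetical names. It assumes the cumulative distribution has already been computed in `float` (the `acc_type`), even when the probabilities are `float16`, and that each thread scans a strided slice of it looking for the bucket containing a uniform draw `u`.

```cuda
// Hypothetical illustration of the fix, not the actual PyTorch kernel.
// Assumes `cumdist` is an inclusive cumsum computed in float (the acc_type)
// and that `*out_index` was initialized to -1 on the host before launch.
__global__ void sample_one_kernel(const float* cumdist, // cumsum in acc_type
                                  int n_categories,
                                  float u,              // uniform draw in [0, 1)
                                  int* out_index) {
    for (int i = threadIdx.x; i < n_categories; i += blockDim.x) {
        float lo = (i == 0) ? 0.0f : cumdist[i - 1];
        // Bucket i covers [lo, cumdist[i]). If the parallel cumsum is
        // non-monotonic at some positions, more than one bucket can
        // appear to contain `u`.
        if (u >= lo && u < cumdist[i]) {
            // atomicMax makes the winner deterministic (the largest
            // matching index) rather than whichever thread stores last.
            atomicMax(out_index, i);
        }
    }
}
```

Without the atomic, each matching thread would race on a plain store to `*out_index`, so the sampled category could differ between otherwise identical runs; `atomicMax` collapses the race to a fixed, schedule-independent result.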