Optimize fused_dropout_kernel launch bounds for AMD hardware
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17870
Differential Revision: D14409990
Pulled By: ezyang
fbshipit-source-id: 0452282f459770823641b2527f47b1186ab14666