[PyTorch] Add test for all-masked case for native softmax
When every element is masked, native softmax returns all NaNs. The CUDA implementation required a fix to handle this case.
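
For illustration only, a minimal pure-Python sketch of why an all-masked row comes out as NaNs. This is not the PyTorch kernel or the test added here: `masked_softmax` is a hypothetical helper, and it assumes masking is modeled by setting masked logits to `-inf`, with `True` in the mask meaning "masked out".

```python
import math

def masked_softmax(row, mask):
    """Reference softmax over one row; mask[i] == True excludes position i."""
    # Assumption of this sketch: masked positions become -inf logits.
    logits = [float("-inf") if m else x for x, m in zip(row, mask)]
    row_max = max(logits)
    # If every position is masked, row_max is -inf, and (-inf) - (-inf) is NaN,
    # so every exponent, the sum, and every quotient below are NaN as well.
    exps = [math.exp(x - row_max) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

all_masked = masked_softmax([1.0, 2.0, 3.0], [True, True, True])
partly_masked = masked_softmax([1.0, 2.0], [False, True])
```

In this sketch a partly masked row still normalizes over the unmasked entries, while the all-masked row has nothing to normalize over and every output is NaN.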
Differential Revision: [D35327730](https://our.internmc.facebook.com/intern/diff/D35327730/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75803
Approved by: https://github.com/ngimel