Add nvFuser support for aten.native_batch_norm_backward (#84546)
Replacing `tensor.reshape(broadcast_mask)` with a sequence of `unsqueeze` calls makes the implementation of `batch_norm_backward` friendlier to the PrimTorch+nvFuser pipeline.
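A minimal sketch of the idea (the tensor names, shapes, and reduction dims below are illustrative, not the actual decomposition): after reducing over the non-channel dims, the broadcastable shape can be restored with `unsqueeze` instead of `reshape(broadcast_mask)`, and both produce identical results.

```python
import torch

# Illustrative only: reduce a (N, C, H, W) tensor over all dims except
# the channel dim, then restore a shape that broadcasts against the input.
grad_out = torch.randn(2, 3, 4, 5)
reduction_dims = [0, 2, 3]                      # every dim except channel (dim 1)
reduced = grad_out.sum(reduction_dims)          # shape: (C,) == (3,)

# Before: reshape with an explicit broadcast mask, e.g. [1, C, 1, 1].
broadcast_mask = [1, grad_out.shape[1], 1, 1]
via_reshape = reduced.reshape(broadcast_mask)   # shape: (1, 3, 1, 1)

# After: unsqueeze each reduced dim back in, one at a time.
via_unsqueeze = reduced
for dim in reduction_dims:                      # visit dims in ascending order
    via_unsqueeze = via_unsqueeze.unsqueeze(dim)

assert torch.equal(via_reshape, via_unsqueeze)  # both are (1, 3, 1, 1)
```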
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84546
Approved by: https://github.com/Chillee