Add nvFuser support for torch.native_batch_norm (#85562)
This PR adds nvFuser's implementation for batch_norm as there's no reference yet (https://github.com/pytorch/pytorch/pull/81191) and no in-place copy support (https://github.com/pytorch/pytorch/pull/84545).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85562
Approved by: https://github.com/kevinstephano, https://github.com/ngimel