Fix inf norm grad (reland) (#48611)
Summary:
Reland of https://github.com/pytorch/pytorch/issues/48122
Does this change cause a performance regression? No significant regression was observed.
Timer script:
```
import torch
from torch.utils.benchmark import Timer
setup="""
a = torch.rand((2, 2), requires_grad=True)
gradient = torch.ones(2)
"""
stmt="""
torch.autograd.grad(torch.norm(a, dim=(0,), keepdim=False), a, gradient)
"""
timer = Timer(stmt, setup)
print(timer.timeit(10000))
print(timer.collect_callgrind(100))
```
Note: the benchmark uses a small matrix, with `keepdim=False` and a non-empty `dim`.
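For context on what the fix concerns: the infinity norm `||x||_inf = max_i |x_i|` is not differentiable everywhere, and its (sub)gradient is `sign(x_j)` at an argmax index `j` and zero elsewhere. A minimal pure-Python sketch of that subgradient (an illustration only, not PyTorch's actual kernel; `inf_norm_grad` is a hypothetical helper, and ties are broken by taking the first argmax):

```python
def inf_norm_grad(x, upstream=1.0):
    # ||x||_inf = max_i |x_i|; a subgradient puts sign(x_j) * upstream
    # at one argmax index j and 0 everywhere else.
    j = max(range(len(x)), key=lambda i: abs(x[i]))
    g = [0.0] * len(x)
    g[j] = (1.0 if x[j] >= 0 else -1.0) * upstream
    return g

print(inf_norm_grad([1.0, -3.0, 2.0]))  # → [0.0, -1.0, 0.0]
```

PyTorch distributes the gradient across tied maxima rather than picking one, but the single-argmax version above keeps the sketch short.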
Before change:
```
Runtime 37.37 us
1 measurement, 10000 runs , 1 thread
All Noisy symbols removed
Instructions: 15279045 15141710
Baseline: 4257 3851
100 runs per measurement, 1 thread
```
After change:
```
Runtime 36.08 us
1 measurement, 10000 runs , 1 thread
All Noisy symbols removed
Instructions: 15296974 15153534
Baseline: 4257 3851
100 runs per measurement, 1 thread
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48611
Reviewed By: albanD, mruberry
Differential Revision: D25309997
Pulled By: soulitzer
fbshipit-source-id: 5fb950dc9259234342985c0e84ada25a7e3814d6