Fix test_profiler_seq_nr flakiness (on macos) (#91019)
Fixes https://github.com/pytorch/pytorch/issues/66893
On macOS, the profiler sometimes reports two `aten::sum` calls where only one is expected. The flakiness is easy to reproduce by running `pytest test_autograd.py -k test_profiler_seq_nr --verbose --flake-finder`. The profiler output from a failing run is as follows (sorted by CPU time):
```
-------------------------------------------------------  ------------  ------------  ------------  ------------  ------------  ------------
Name                                                       Self CPU %      Self CPU   CPU total %     CPU total  CPU time avg    # of Calls
-------------------------------------------------------  ------------  ------------  ------------  ------------  ------------  ------------
aten::randn                                                    16.67%       3.000us        27.78%       5.000us       2.500us             2
aten::sum                                                      16.67%       3.000us        27.78%       5.000us       2.500us             2
aten::normal_                                                  11.11%       2.000us        11.11%       2.000us       1.000us             2
aten::add                                                      11.11%       2.000us        11.11%       2.000us       2.000us             1
autograd::engine::evaluate_function: torch::autograd...        11.11%       2.000us        27.78%       5.000us       2.500us             2
torch::autograd::AccumulateGrad                                11.11%       2.000us        16.67%       3.000us       1.500us             2
aten::ones_like                                                 5.56%       1.000us         5.56%       1.000us       1.000us             1
autograd::engine::evaluate_function: SumBackward0               5.56%       1.000us        11.11%       2.000us       2.000us             1
aten::expand                                                    5.56%       1.000us         5.56%       1.000us       1.000us             1
aten::copy_                                                     5.56%       1.000us         5.56%       1.000us       0.500us             2
aten::empty                                                     0.00%       0.000us         0.00%       0.000us       0.000us             2
aten::as_strided                                                0.00%       0.000us         0.00%       0.000us       0.000us             2
aten::fill_                                                     0.00%       0.000us         0.00%       0.000us       0.000us             2
aten::empty_like                                                0.00%       0.000us         0.00%       0.000us       0.000us             1
aten::empty_strided                                             0.00%       0.000us         0.00%       0.000us       0.000us             3
SumBackward0                                                    0.00%       0.000us         5.56%       1.000us       1.000us             1
autograd::engine::evaluate_function: AddBackward0               0.00%       0.000us         0.00%       0.000us       0.000us             1
AddBackward0                                                    0.00%       0.000us         0.00%       0.000us       0.000us             1
aten::new_empty_strided                                         0.00%       0.000us         0.00%       0.000us       0.000us             2
-------------------------------------------------------  ------------  ------------  ------------  ------------  ------------  ------------
Self CPU time total: 18.000us
```
When it happens, the two `aten::sum` calls have different inputs:
```
aten::sum    4.35%    1.000us    13.04%    3.000us    3.000us    1    [[10, 10], []]
aten::sum    8.70%    2.000us     8.70%    2.000us    2.000us    1    [[10, 10], [], [], []]
```
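A minimal sketch of how the recorded input shapes of each `aten::sum` event can be inspected, for anyone who wants to compare the two call forms locally. The workload is an assumption reconstructed from the operators in the profile above, not a copy of the test body, and `sum_event_shapes` is a hypothetical helper name. It relies on `record_shapes=True` so that each profiler event carries its `input_shapes`:

```python
import torch
from torch.autograd.profiler import profile


def sum_event_shapes(explicit_dim: bool):
    """Profile a small forward/backward pass and return the input shapes
    recorded for every aten::sum event.

    The workload (two randn tensors, an add, a sum, a backward) is an
    assumption based on the operators listed in the profile, not the
    actual test code.
    """
    with profile(record_shapes=True) as prof:
        x = torch.randn(10, 10, requires_grad=True)
        y = torch.randn(10, 10, requires_grad=True)
        z = x + y
        # Switch between the two call forms being compared.
        s = z.sum(dim=None) if explicit_dim else z.sum()
        s.backward()
    return [
        evt.input_shapes
        for evt in prof.function_events
        if evt.name == "aten::sum"
    ]


if __name__ == "__main__":
    for explicit_dim in (False, True):
        shapes = sum_event_shapes(explicit_dim)
        print(f"explicit dim=None: {explicit_dim}, aten::sum events: {len(shapes)}")
        for shape in shapes:
            print("   ", shape)
```

The number of events and the shapes printed may differ between platforms and PyTorch versions, which is exactly what this is meant to surface.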
I'm not sure what the internal difference between `z.sum()` and `z.sum(dim=None)` is on macOS; I expected them to be equivalent.
### Testing
Ran `pytest test_autograd.py -k test_profiler_seq_nr --verbose --flake-finder`, which runs the test 50 times; all runs pass.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91019
Approved by: https://github.com/malfet