Add PT2 graph breaks metric (#1858)
Summary:
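Report the number of PT2 graph breaks in the benchmark run output, alongside the PT2 compilation time.
Fixes https://github.com/pytorch/benchmark/issues/1849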
Pull Request resolved: https://github.com/pytorch/benchmark/pull/1858
Test Plan:
```
$ buck2 run mode/opt //pytorch/benchmark:run -- blue_reels_vdd_v3 -d cuda -t train --torchdynamo inductor --torchinductor_cudagraph 0
GPU Time: 92.441 milliseconds
CPU Total Wall Time: 92.504 milliseconds
GPU 0 Peak Memory: 18.8922 GB
CPU Peak Memory: 12.9443 GB
PT2 Compilation time: 88.879 seconds
PT2 Graph Breaks: 1.000
```
Reviewed By: FindHao
Differential Revision: D48738620
Pulled By: xuzhao9
fbshipit-source-id: 406e09722634267f9ec4f08c4e2d93419021f11d