Account for forwards whose corresponding backwards are not invoked (#98112)
Previously, running a forward graph whose backward we never invoked would prevent us from switching from warmup to recording. Now, the heuristic is refined to allow incrementing the generation as soon as we invoke a backward graph. This still handles the
```
mod1 = torch.compile(...)
mod2 = torch.compile(...)
mod2(mod1(x)).sum().backward()
```
case while accounting for graphs whose backward we may never run.
It also now handles the case where we skip cudagraphifying the backward of a forward.
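One plausible reading of the heuristic can be sketched as follows. This is illustrative pseudocode, not the actual CUDAGraph trees implementation: the `GenerationTracker` class and its method names are hypothetical, standing in for the real bookkeeping inside `torch._inductor`.

```python
# Illustrative sketch of the generation heuristic (hypothetical names,
# not PyTorch internals). The generation advances on the first backward
# of an iteration, so forwards whose backwards are never invoked no
# longer block the warmup -> recording transition.
class GenerationTracker:
    def __init__(self):
        self.generation = 0
        self.pending_forwards = set()

    def run_forward(self, graph_id):
        # Running a forward alone never advances the generation.
        self.pending_forwards.add(graph_id)

    def run_backward(self, graph_id):
        # The first backward after any forwards marks a new generation
        # and clears all pending forwards, including those whose
        # backward will never run (or whose backward we skipped
        # cudagraphifying).
        if self.pending_forwards:
            self.generation += 1
            self.pending_forwards.clear()


tracker = GenerationTracker()
# mod2(mod1(x)).sum().backward(): two forwards, then both backwards.
tracker.run_forward("mod1")
tracker.run_forward("mod2")
tracker.run_backward("mod2")  # first backward advances the generation
tracker.run_backward("mod1")  # same iteration: no further increment
```

Under this sketch, the chained-module case above still counts as a single generation, while a forward whose backward is never invoked is simply cleared at the next backward rather than stalling the state machine.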
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98112
Approved by: https://github.com/jansel