Warn on failure to end warmup, add explicit API for start of model invocation (#101129)
CUDAGraph trees need to know when a new invocation of your model begins. We have two heuristics for that:
- you invoke the torch.compile'd callable again (for example, a top-level module you compiled)
- you have run a forward whose corresponding backward hasn't been invoked yet, in which case we ignore the heuristic above
These heuristics don't always get it right, especially if you forget to use torch.no_grad() during inference. This PR adds a warning for that case and an explicit `cudagraph_mark_step_begin` API.
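
A minimal usage sketch of the explicit marker in an inference loop. The message above names only `cudagraph_mark_step_begin`; looking it up under `torch.compiler` is an assumption based on later PyTorch releases, so the code probes for it instead of requiring it. The model, shapes, and helper names (`get_mark_step_begin`, `run_inference`) are hypothetical, and on machines without CUDA the sketch falls back to eager execution without calling the marker.

```python
import torch


def get_mark_step_begin():
    """Return the step-marking hook if this PyTorch build exposes it, else None.

    Assumption: the hook is reachable as torch.compiler.cudagraph_mark_step_begin.
    """
    compiler_ns = getattr(torch, "compiler", None)
    return getattr(compiler_ns, "cudagraph_mark_step_begin", None)


def run_inference(n_steps=3, batch=2, in_features=8, out_features=4):
    use_cuda = torch.cuda.is_available()
    device = "cuda" if use_cuda else "cpu"
    model = torch.nn.Linear(in_features, out_features).to(device).eval()

    # CUDA graphs only apply on GPU; on CPU, stay in eager mode so the sketch still runs.
    fn = torch.compile(model, mode="reduce-overhead") if use_cuda else model
    mark_step_begin = get_mark_step_begin() if use_cuda else None

    outputs = []
    # no_grad avoids leaving a pending backward, which would confuse the heuristics.
    with torch.no_grad():
        for _ in range(n_steps):
            if mark_step_begin is not None:
                # Explicitly signal that a new model invocation starts here.
                mark_step_begin()
            x = torch.randn(batch, in_features, device=device)
            # Clone before storing: CUDA graph outputs may be overwritten by a later replay.
            outputs.append(fn(x).clone())
    return outputs
```

Calling the marker once at the top of each iteration keeps the step boundary explicit, so correctness does not depend on either heuristic firing.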
Pull Request resolved: https://github.com/pytorch/pytorch/pull/101129
Approved by: https://github.com/ezyang