[Dynamo, Compiled] Save some Python overhead when calling a compiled function with many tangents (#118730)
When a Dynamo backend captures the entire forward pass and the entire backward pass without graph breaks, the compiled function can make many `contiguous` calls (as I recall, hundreds or thousands for a big model). We can avoid that overhead by checking `is_contiguous` before calling `contiguous`.
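A minimal sketch of the pattern, not the actual change in this PR. `StubTensor` and `ensure_contiguous` are hypothetical names used only for illustration; the stub stands in for a real tensor and only mimics the `is_contiguous()` / `contiguous()` methods so that the number of `contiguous` calls can be counted.

```python
class StubTensor:
    """Hypothetical stand-in for a tensor; it counts contiguous() calls."""

    def __init__(self, contiguous):
        self._contiguous = contiguous
        self.contiguous_calls = 0

    def is_contiguous(self):
        # Cheap check: no new object is created.
        return self._contiguous

    def contiguous(self):
        # Count every call so the saving is observable.
        self.contiguous_calls += 1
        if self._contiguous:
            return self
        return StubTensor(True)


def ensure_contiguous(tangents):
    """Call contiguous() only on tangents that are not already contiguous."""
    return [t if t.is_contiguous() else t.contiguous() for t in tangents]


tangents = [StubTensor(True), StubTensor(False), StubTensor(True)]
result = ensure_contiguous(tangents)
```

Already-contiguous tangents are passed through unchanged, so only the non-contiguous one pays for a `contiguous` call.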
Pull Request resolved: https://github.com/pytorch/pytorch/pull/118730
Approved by: https://github.com/thiagocrepaldi, https://github.com/ezyang