Avoid recursion in graph traversal (#95723)
It's easy to hit Python's recursion limit when calling `dfs_find_cycle` on big graphs (e.g., searching for attention heads in GPT-2 via SubgraphMatcher). Let's switch to queue-based graph traversal.
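
For illustration only, here is a minimal sketch of cycle detection without recursion. It is not the code from this PR: the function name `has_cycle_iterative` and the adjacency-dict graph representation are assumptions made for the example, and it keeps pending work in an explicit list used as a stack (to preserve depth-first order) rather than relying on the Python call stack.

```python
from typing import Dict, Hashable, Iterable


def has_cycle_iterative(graph: Dict[Hashable, Iterable[Hashable]]) -> bool:
    """Return True if the directed graph contains a cycle.

    Hypothetical helper: `graph` maps each node to its successors.
    Nodes that appear only as successors are treated as having no edges.
    """
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    color = {node: WHITE for node in graph}

    for root in graph:
        if color[root] != WHITE:
            continue
        color[root] = GRAY
        # Each entry pairs a node with an iterator over its remaining successors.
        stack = [(root, iter(graph[root]))]
        while stack:
            node, successors = stack[-1]
            for child in successors:
                state = color.get(child, WHITE)
                if state == GRAY:
                    # Back edge to a node on the current path: cycle found.
                    return True
                if state == WHITE:
                    color[child] = GRAY
                    stack.append((child, iter(graph.get(child, ()))))
                    break  # descend into the child first
            else:
                # All successors explored: the node is finished.
                color[node] = BLACK
                stack.pop()
    return False
```

Because the depth of the traversal is bounded by the length of a list instead of the interpreter's recursion limit, a chain of tens of thousands of nodes can be checked without raising `RecursionError`.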
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95723
Approved by: https://github.com/SherlockNoMad, https://github.com/Skylion007