Speed up transitive_dep_hash for singleton SCCs (#21390)
I noticed that on a large codebase ~99.84% of SCCs are singletons.
So the two main changes are adding a singleton fast path and extracting
`graph[id]` into a local variable, so that the lookup happens once per
module rather than once per dependency.
The time spent in the `transitive_dep_hash` function drops by 22%, and the
overall warm run improves by 1%.
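
For illustration, here is a minimal sketch of the two ideas, not the actual patch. `Node`, `hash_scc_general`, `hash_scc`, and the hashing scheme are hypothetical stand-ins; only the shape of the optimization (a singleton fast path, and one `graph[...]` lookup per module) comes from the description above.

```python
from __future__ import annotations

import hashlib
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a build-graph entry (not the real class)."""
    interface_hash: str
    dependencies: list[str] = field(default_factory=list)
    dep_hash: str = ""


def _digest(parts: list[str]) -> str:
    # NUL-terminate each part so adjacent parts cannot run together.
    h = hashlib.sha256()
    for part in parts:
        h.update(part.encode("utf-8") + b"\0")
    return h.hexdigest()


def hash_scc_general(graph: dict[str, Node], scc: list[str]) -> str:
    """General path: handles an SCC of any size."""
    members = set(scc)
    parts: list[str] = []
    external: set[str] = set()
    for mod_id in sorted(scc):
        node = graph[mod_id]  # looked up once per module, not once per dependency
        parts += [mod_id, node.interface_hash]
        for dep in node.dependencies:
            if dep not in members:
                external.add(dep)
    for dep in sorted(external):
        parts += [dep, graph[dep].dep_hash]
    return _digest(parts)


def hash_scc(graph: dict[str, Node], scc: list[str]) -> str:
    """Same result as hash_scc_general, with a fast path for singleton SCCs."""
    if len(scc) == 1:
        # Singleton fast path: no member set to build or test against.
        mod_id = scc[0]
        node = graph[mod_id]
        parts = [mod_id, node.interface_hash]
        for dep in sorted(set(node.dependencies)):
            if dep != mod_id:
                parts += [dep, graph[dep].dep_hash]
        return _digest(parts)
    return hash_scc_general(graph, scc)
```

In this sketch the fast path has to return exactly what the general path returns for a one-element SCC, so the optimization does not change any hash.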