benchmark
2713258b - Put "everything" WaitCounters in dynamo_timed (#151757)

Commit

1 year ago

Put "everything" WaitCounters in dynamo_timed (#151757)

Summary:
The main motivation is to capture the cudagraphs overhead in a WaitCounter. We'll combine that with Triton autotuning, and therefore rename to "compile_runtime_overheads". Since we have a couple WaitCounters where we want to capture all runtime and compile overheads, let's put the accounting in dynamo_timed so we'll automatically capture any toplevel timed regions that get added in the future. Also, dynamo_timed already has to figure out if we're timing a runtime vs. compile-time event, so we can reuse some of that logic.

X-link: https://github.com/pytorch/pytorch/pull/151757
Approved by: https://github.com/ppanchalia
ghstack dependencies: #151749

Reviewed By: wdvr

Differential Revision: D73440149

fbshipit-source-id: 1b9074bef52b902da09001b4c006661c7d537477

Author

masnesral

Committer

facebook-github-bot

Parents

3dd99cb3
