Optimize PT2 Compile Events ingestion and column formats
Summary:
X-link: https://github.com/pytorch/pytorch/pull/139309
Per discussion from https://fb.workplace.com/groups/1286739428954016/posts/1360522894909002
This diff substantially changes the column format of PT2 Compile Events. We now log to Scuba only the subset of dynamo_timed() events that we actually care about aggregating. To do so, we add a boolean argument to dynamo_timed() that controls whether a pt2_compile_event is logged. Every dynamo_timed() call still logs a chromium_event, but only the opted-in subset is also logged to Scuba.
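As a rough illustration of the opt-in behavior described above, here is a minimal standalone sketch. It does not use the real dynamo_timed() implementation: the function name `timed_sketch`, the flag name `log_pt2_compile_event`, the event fields, and the two in-memory lists are all assumptions made for this example.

```python
import time
from contextlib import contextmanager

# Hypothetical in-memory sinks standing in for the real logging backends.
chromium_events = []      # receives every timed event
pt2_compile_events = []   # receives only opted-in events (the Scuba subset)


@contextmanager
def timed_sketch(name, log_pt2_compile_event=False):
    """Time a block; always record a chromium event, optionally a Scuba row.

    The flag name is an assumption for this sketch, not a confirmed signature.
    """
    start_ns = time.perf_counter_ns()
    try:
        yield
    finally:
        event = {"name": name, "duration_ns": time.perf_counter_ns() - start_ns}
        chromium_events.append(event)          # unconditional
        if log_pt2_compile_event:
            pt2_compile_events.append(event)   # only when the caller opts in


# One event we want to aggregate on, one we do not.
with timed_sketch("backend_compile", log_pt2_compile_event=True):
    pass
with timed_sketch("minor_helper"):
    pass
```

Defaulting the flag to False in this sketch means a new timed region never adds Scuba rows unless the caller asks for it.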
Logging all metadata into a single metadata column saves space and ingestion cost: rows for different events no longer each carry N empty column markers for fields that do not apply to them. The cost is that we have to create derived columns in the Scuba UI to use the extra metadata we care about. That is a tradeoff we are willing to make here, given that other tables such as dynamo_compile exist.
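The column tradeoff can be sketched as follows. This is only an illustration of the idea, not the actual logging code: `make_row`, the event names, and the metadata keys are hypothetical, and JSON is assumed as the serialization for the metadata column.

```python
import json


def make_row(event_name, metadata):
    """Build a row with a fixed schema: per-event fields go into one column."""
    # Serializing to a single string keeps the table schema at two columns,
    # no matter how many distinct metadata keys different events use.
    return {"event": event_name, "metadata": json.dumps(metadata, sort_keys=True)}


# Two different events with disjoint metadata keys (all names hypothetical).
rows = [
    make_row("event_a", {"cache_hit": True}),
    make_row("event_b", {"graph_id": 3, "num_nodes": 12}),
]

# The set of columns does not grow with the number of metadata keys.
column_names = {key for row in rows for key in row}

# Reading a field back requires parsing the column, which is the role a
# derived column plays on the query side.
num_nodes = json.loads(rows[1]["metadata"])["num_nodes"]
```

With one column per metadata key, the same two rows would need three metadata columns, two of them empty in the first row and one empty in the second.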
ghstack-source-id: 251214365
exported-using-ghexport
Reviewed By: oulgen
Differential Revision: D65225598
fbshipit-source-id: 01569a79174ed3699063dbd8bb26b883c6a7b0c4