This PR adds a new SPI method, ConnectorSplitSource#getMetrics, to capture detailed metrics during split generation, and integrates these metrics into various scheduling and execution components. Key changes include:
Copilot reviewed 91 out of 91 changed files in this pull request and generated 1 comment.
File | Description |
---|---|
core/trino-main/src/main/java/io/trino/operator/ScanFilterAndProjectOperator.java | Updated operator context instantiation to include sourceId. |
core/trino-main/src/main/java/io/trino/operator/OperatorStats.java | Introduced sourceId, added getter, and implemented addConnectorSplitSourceMetrics for merging metrics. |
core/trino-main/src/main/java/io/trino/operator/OperatorInfo.java | Removed obsolete SplitOperatorInfo mapping. |
core/trino-main/src/main/java/io/trino/operator/OperatorContext.java | Updated constructor and field to include sourceId. |
core/trino-main/src/main/java/io/trino/operator/DriverContext.java | Added overloaded addOperatorContext accepting sourceId. |
core/trino-main/src/main/java/io/trino/metadata/Split.java | Removed getInfo method. |
core/trino-main/src/main/java/io/trino/execution/scheduler/faulttolerant/* | Changed split time recorder to use metricsRecorder. |
core/trino-main/src/main/java/io/trino/execution/* | Updated methods to record and propagate split source metrics. |
core/trino-main/src/main/java/io/trino/connector/* | Removed or updated split info formatting methods. |
core/trino-main/src/main/java/io/trino/execution/SqlTaskExecution.java:897
return (partitionedSplit == null) ? "" : partitionedSplit.getSplit().toString();
Overall LGTM
This PR adds a new SPI method, ConnectorSplitSource#getMetrics, to allow connectors to expose detailed metrics from split generation, and updates various components to record and display these metrics. Key changes include:
Copilot reviewed 91 out of 91 changed files in this pull request and generated 2 comments.
File | Description |
---|---|
ScanFilterAndProjectOperator.java | Updated operator context creation to pass Optional sourceId. |
OperatorStats.java | Added sourceId field and withConnectorSplitSourceMetrics method to merge connector split metrics. |
OperatorInfo.java | Removed deprecated SplitOperatorInfo registration. |
OperatorContext.java | Injected Optional sourceId into constructors. |
DriverContext.java | Added overload to support Optional sourceId when adding an operator context. |
Split.java | Removed getInfo() method to deprecate split info representation. |
SplitSourceMetricsRecorder.java | Introduced new interface for recording metrics. |
EventDrivenTaskSourceFactory.java, EventDrivenTaskSource.java | Replaced BiConsumer time recorder with a metrics recorder, updating corresponding references. |
StageExecution.java, SourcePartitionedScheduler.java, PipelinedStageExecution.java, StageStats.java, StageStateMachine.java, SqlTaskExecution.java, SqlStage.java, QueryStateMachine.java | Updated methods to record and propagate split source metrics. |
SystemSplit.java, InformationSchemaSplit.java | Changed the split info formatting to rely on toString(). |
core/trino-main/src/main/java/io/trino/execution/scheduler/StageStateMachine.java:753
splitSourceMetrics.put(nodeId, metrics);
Description
Add a new SPI method, io.trino.spi.connector.ConnectorSplitSource#getMetrics.
This allows connectors to expose detailed metrics from splits generation.
EXPLAIN ANALYZE VERBOSE is modified to print these metrics.
Added an implementation in Iceberg to collect metrics from the Iceberg metadata scan.
Example output
Remove SplitOperatorInfo and ConnectorSplit#getSplitInfo. This
avoids the need to send split info from workers to the coordinator.
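For readers unfamiliar with the idea behind getMetrics, here is a minimal, self-contained sketch of a split source that counts work while enumerating splits and exposes a snapshot of those counters. It deliberately does not use Trino types: `CountingSplitSource`, `recordManifest`, the metric names, and the `Map<String, Long>` return type are all hypothetical stand-ins, and the summing `merge` is only one plausible way a coordinator could combine metrics from several sources. The actual return type and merge semantics are defined by the PR's code, not by this sketch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SplitSourceMetricsSketch {
    /** Hypothetical split source that counts work done while enumerating splits. */
    static final class CountingSplitSource {
        private long manifestsScanned;
        private long splitsGenerated;

        // Called once per metadata file processed; the names are illustrative only.
        void recordManifest(int splitsProduced) {
            manifestsScanned++;
            splitsGenerated += splitsProduced;
        }

        // Stand-in for the new getMetrics hook: a snapshot of the counters so far.
        Map<String, Long> getMetrics() {
            Map<String, Long> metrics = new LinkedHashMap<>();
            metrics.put("manifestsScanned", manifestsScanned);
            metrics.put("splitsGenerated", splitsGenerated);
            return metrics;
        }
    }

    // Merge snapshots from two split sources by summing values that share a name.
    static Map<String, Long> merge(Map<String, Long> left, Map<String, Long> right) {
        Map<String, Long> merged = new LinkedHashMap<>(left);
        right.forEach((name, value) -> merged.merge(name, value, Long::sum));
        return merged;
    }

    public static void main(String[] args) {
        CountingSplitSource first = new CountingSplitSource();
        first.recordManifest(3);
        first.recordManifest(2);

        CountingSplitSource second = new CountingSplitSource();
        second.recordManifest(4);

        // Print the combined metrics, roughly what a stage-level summary might hold.
        System.out.println(merge(first.getMetrics(), second.getMetrics()));
    }
}
```

Because the metrics are gathered where splits are generated, nothing in this sketch requires per-split information to travel from workers back to the coordinator.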
Additional context and related issues
Release notes
( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(x) Release notes are required, with the following suggested text: