trino
Remove spooled metadata symbols & inline data on worker nodes
#25439
Merged

Remove spooled metadata symbols & inline data on worker nodes #25439

wendigo merged 2 commits into master from serafin/spooled-layout-symbols
wendigo
wendigo43 days ago (edited 42 days ago)

Description

Additional context and related issues

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(x) Release notes are required, with the following suggested text:

## General
* Improve scalability of inline data encoding in the spooled protocol ({issue}`issuenumber`)
cla-bot cla-bot added cla-signed
wendigo wendigo requested a review from kasiafi kasiafi 43 days ago
wendigo wendigo requested a review from losipiuk losipiuk 43 days ago
wendigo wendigo force pushed from d09f1128 to 823d222f 43 days ago
wendigo wendigo force pushed from 823d222f to 1c726dd9 43 days ago
wendigo wendigo changed the title Remove artificial spooling metadata symbol from layout Remove spooled metadata symbols & inline data on worker nodes 43 days ago
wendigo wendigo requested a review from copilot-pull-request-reviewer copilot-pull-request-reviewer 43 days ago
copilot-pull-request-reviewer
copilot-pull-request-reviewer commented on 2025-03-27
copilot-pull-request-reviewer43 days ago

Pull Request Overview

This PR removes the usage of spooled metadata symbols and the SpooledBlock class from various components to inline metadata data on worker nodes. Key changes include:

  • Replacing references to the removed SpooledBlock with a new sealed interface (SpoolingMetadataBlock) and its serializer/deserializer.
  • Removing filtering and dependency handling based on spooling metadata symbols in the plan printers, analyzers, and cost/statistics estimators.
  • Removing construction and propagation of the SpoolingManagerRegistry from planners and query execution.

Reviewed Changes

Copilot reviewed 19 out of 19 changed files in this pull request and generated no comments.

Show a summary per file
File Description
core/trino-main/src/test/java/io/trino/server/protocol/spooling/TestSpoolingMetadataBlockSerde.java Added tests for serialization round trip of new spooling metadata blocks.
core/trino-main/src/test/java/io/trino/server/protocol/spooling/TestSpooledBlock.java Removed obsolete test file for the removed SpooledBlock.
core/trino-main/src/main/java/io/trino/sql/planner/sanity/ValidateDependenciesChecker.java Updated dependency validations to stop filtering out spooling metadata symbols.
core/trino-main/src/main/java/io/trino/server/protocol/spooling/SpoolingQueryDataProducer.java Updated metadata handling to use a switch on the new SpoolingMetadataBlock types.
core/trino-main/src/main/java/io/trino/server/protocol/spooling/SpoolingMetadataBlockSerde.java New serialization logic supporting the sealed SpoolingMetadataBlock interface.
core/trino-main/src/main/java/io/trino/server/protocol/spooling/SpoolingMetadataBlock.java New sealed interface that replaces the removed SpooledBlock implementation.
core/trino-main/src/main/java/io/trino/operator/OutputSpoolingOperatorFactory.java Adjusted inline data handling and metadata layout to eliminate spooling metadata symbols.
core/trino-main/src/main/java/io/trino/execution/SqlQueryExecution.java Removed spoolingManagerRegistry dependency from query execution.
core/trino-main/src/main/java/io/trino/cost/PlanNodeStatsEstimate.java Removed spooling metadata filtering from statistics calculations.
Comments suppressed due to low confidence (6)

core/trino-main/src/main/java/io/trino/sql/planner/sanity/ValidateDependenciesChecker.java:500

  • Removing filtering on the spooling metadata symbol here changes the dependency list; please verify that including all output symbols (even metadata) in the dependency check is intended and will not introduce unexpected behavior downstream.
checkDependencies(source.getOutputSymbols(), node.getOutputSymbols(), "Invalid node. Output column dependencies (%s) not in source plan output (%s)", node.getOutputSymbols(), source.getOutputSymbols());

core/trino-main/src/main/java/io/trino/server/protocol/spooling/SpoolingQueryDataProducer.java:55

  • Ensure that the new switch statement covering both Spooled and Inlined cases correctly handles attribute updates (including currentOffset) and segment URI construction as expected.
switch (metadataBlock) {

core/trino-main/src/main/java/io/trino/operator/OutputSpoolingOperatorFactory.java:309

  • Verify that switching from creating an empty single row page using the old SpooledBlock to using the new SpoolingMetadataBlock.forSpooledLocation produces a correct page layout and that all required metadata attributes are preserved.
return SpoolingMetadataBlock.forSpooledLocation(spoolingManager.location(segmentHandle), attributes).serialize();

core/trino-main/src/main/java/io/trino/execution/SqlQueryExecution.java:159

  • Confirm that removing the spoolingManagerRegistry from query execution does not impact any downstream spooling functionality, and that any required spooling behavior has been adequately replicated elsewhere.
private final SpoolingManagerRegistry spoolingManagerRegistry;

core/trino-main/src/main/java/io/trino/cost/PlanNodeStatsEstimate.java:84

  • With the removal of spooling metadata symbols, ensure that excluding these symbols from output size estimations is no longer necessary and that statistics calculations remain accurate.
.filter(symbol -> !symbol.type().equals(SPOOLING_METADATA_TYPE))

core/trino-main/src/main/java/io/trino/server/protocol/spooling/SpoolingMetadataBlockSerde.java:37

  • [nitpick] Update the documentation to reflect the new ordering or purpose of the fields if any adjustments were made to the serialization layout.
* 0 - VARCHAR   - metadata
losipiuk
losipiuk approved these changes on 2025-03-27
wendigo Remove artificial spooling metadata symbol from layout
bb9d0c1e
wendigo Encode inline data in the spooling operator
d5eaadb7
wendigo wendigo force pushed from 1c726dd9 to d5eaadb7 42 days ago
wendigo wendigo merged c6c14699 into master 42 days ago
wendigo wendigo deleted the serafin/spooled-layout-symbols branch 42 days ago
github-actions github-actions added this to the 475 milestone 42 days ago

Login to write a write a comment.

Login via GitHub

Assignees
No one assigned
Labels
Milestone