@Ben-Zvi Please review. This also fixes the code generation errors that I introduced in a previous commit.
@Ben-Zvi Added description of the changes
I compared the old-new codes; seem like a nice work - +1
One suggestion: Can take some of the implementation descriptions written for the PR and add them as JavaDoc for the new interface/classes.
211 | } | ||
212 | |||
213 | spilledBatchesCount += getBatchHolderCount(); // update count of spilled batches | ||
214 | } |
Minor comment: The original code was tracking and logging/tracing the number of rows spilled (and # batches) at this point. Why was that removed ?
@ilooner Could we proceed with this PR?
@vdiravka Got overloaded with some deadlines for another project. I won't be able to finish this for this release.
Login to write a write a comment.
Goal
The goal of this change is to move code for managing a partition in HashAgg into a seperate class, similar to the HashPartition class in HashJoinBatch. This has several advantages:
The original design doc describing the motivation behind the change and how it relates to memory calculations is here.
Implementation HashAggPartition
In order to achieve this I introduced some interfaces:
Then the relevant code for HashAggPartition, and HashAggMemoryCalculator calculator were moved out of HashAggTemplate and into implementation classes HashAggPartitionImpl and HashAggMemoryCalculatorImpl.
The rest of the code changes focus on replacing the usages of the following arrays with a single HashAggPartition array in HashAggTemplate:
Next Steps
This change is a first step to moving the code for a partition into a separate HashAggPartition class. I did not move all of it in this change in order to make review easier and to minimize the risk for bugs. A follow up change will complete moving all the code relevant to a partition into HashAggPartition. The main body of code that remains to be moved is most of the HashAggTemplate.checkGroupAndAggrValues function.
Additionally some debugging statements which printed out various partition stats were removed in this change. After the code for HashAggTemplate.checkGroupAndAggrValues is moved into HashAggPartition, I will make the stats available to print for debugging on the HashAggPartition class itself.