DRILL-4287: During initial DrillTable creation don't read the metadata cache file; instead do it during ParquetGroupScan.
Maintain state in FileSelection to keep track of whether certain operations have been done on that selection.
Remove ParquetFileSelection since its only purpose was to carry the metadata cache information which is not needed anymore.
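
For illustration only, here is a minimal sketch of the kind of state tracking described above. It is not the actual FileSelection code: the Status enum, the Selection class, the directoryChecks counter, and the convention that a trailing "/" marks a directory are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SelectionStateSketch {
    // Hypothetical states; the real FileSelection may use different names.
    enum Status { NOT_CHECKED, NO_DIRS, HAS_DIRS }

    static class Selection {
        private final List<String> paths;
        private Status status = Status.NOT_CHECKED;
        // Counts how often the (potentially expensive) directory check ran.
        int directoryChecks = 0;

        Selection(List<String> paths) {
            this.paths = new ArrayList<>(paths);
        }

        // Runs the check at most once and remembers the result in 'status'.
        boolean containsDirectories() {
            if (status == Status.NOT_CHECKED) {
                directoryChecks++;
                boolean hasDirs = false;
                for (String p : paths) {
                    // Assumption for this sketch: a trailing slash marks a directory.
                    if (p.endsWith("/")) {
                        hasDirs = true;
                        break;
                    }
                }
                status = hasDirs ? Status.HAS_DIRS : Status.NO_DIRS;
            }
            return status == Status.HAS_DIRS;
        }
    }

    public static void main(String[] args) {
        Selection s = new Selection(Arrays.asList("/tmp/t/2015/", "/tmp/t/a.parquet"));
        System.out.println(s.containsDirectories());
        System.out.println(s.containsDirectories());
        System.out.println("checks run: " + s.directoryChecks);
    }
}
```

The second call to containsDirectories() reuses the recorded state instead of rescanning the paths.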
Conflicts:
exec/java-exec/src/main/java/org/apache/drill/exec/planner/FileSystemPartitionDescriptor.java
exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetFileSelection.java
Resolve issues after rebasing:
1) JsonIgnore fileSelection in ParquetGroupScan
2) FileSystemPartitionDescriptor change.
Conflicts:
exec/java-exec/src/main/java/org/apache/drill/exec/planner/FileSystemPartitionDescriptor.java
DRILL-4287: Address code review comments and follow-up changes after rebasing:
- In FileSelection: updated call to the Stopwatch, set all flags appropriately in minusDirectories(), modify supportDirPruning()
- In ParquetGroupScan: Simplify directory checking in constructor, set the parquetTableMetadata field after reading metadata cache.
- Fix unit tests to use an alias for the reserved dir<N> columns as partition-by columns.
More follow-up changes:
- Get rid of fileSelection attribute in ParquetGroupScan
- Initialize entries after expanding the selection when metadata cache is used
- For non-metadata cache, don't do any expansion in the constructor; let init() handle it
- In FileSystemPartitionDescriptor, modify createPartitionSublists to check for a Parquet scan
When reading from the metadata cache, ensure the selection root does not contain the scheme and authority prefix. Minor refactoring.
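
As a rough illustration of what dropping the scheme and authority prefix means, the sketch below uses only java.net.URI. The helper name stripSchemeAndAuthority and the sample paths are hypothetical; the actual change may rely on Hadoop's Path utilities instead.

```java
import java.net.URI;

public class SelectionRootExample {
    // Hypothetical helper: keep only the path component of a selection root,
    // discarding any scheme ("hdfs") and authority ("namenode:8020").
    static String stripSchemeAndAuthority(String root) {
        return URI.create(root).getPath();
    }

    public static void main(String[] args) {
        // Fully qualified root: scheme and authority are removed.
        System.out.println(stripSchemeAndAuthority("hdfs://namenode:8020/tmp/parquet_table"));
        // A root that is already a plain path is returned unchanged.
        System.out.println(stripSchemeAndAuthority("/tmp/parquet_table"));
    }
}
```

Both calls print /tmp/parquet_table, so a root normalized this way compares equal whether or not it was originally fully qualified.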
Address code review comments and fix a bug. Simplify FileSelection state management based on review comment.
close apache/drill#376