DRILL-6951: Merge row set based mock data source

The mock data source is used in several tests to generate a large volume
of sample data, such as when testing spilling. The mock data source also
lets us try new plugin features in a very simple context. During the
development of the row set framework, the mock data source was converted
to use the new framework to verify functionality. This commit upgrades
the mock data source with that work.

The work changes none of the functionality. It does, however, improve
memory usage. Batches are limited, by default, to 10 MB in size. The row
set framework minimizes internal fragmentation in the largest vector.
(Previously, internal fragmentation averaged 25% but could be as high as
50%.)
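
The 25% and 50% figures above are what one would expect if each vector
buffer is rounded up to the next power of two. That allocation policy is
an assumption made for this sketch, and the class and method names are
hypothetical, not part of the mock data source or the row set framework.

```java
// Sketch only: assumes buffers are rounded up to the next power of two.
// All names here are hypothetical and exist only for illustration.
class FragmentationSketch {

    // Smallest power of two that is >= n (for n >= 1).
    static long roundUpToPowerOfTwo(long n) {
        if (n <= 1) {
            return 1;
        }
        return Long.highestOneBit(n - 1) << 1;
    }

    // Fraction of the allocated buffer that holds no data.
    static double wastedFraction(long neededBytes) {
        long allocated = roundUpToPowerOfTwo(neededBytes);
        return (double) (allocated - neededBytes) / allocated;
    }

    // Mean wasted fraction over sizes in (low, 2 * low], sampled every
    // `step` bytes. `low` is expected to be a power of two.
    static double averageWaste(long low, long step) {
        double sum = 0;
        long count = 0;
        for (long n = low + step; n <= 2 * low; n += step) {
            sum += wastedFraction(n);
            count++;
        }
        return sum / count;
    }

    public static void main(String[] args) {
        long oneMiB = 1L << 20;
        // One byte past a power of two: almost half the buffer is unused.
        System.out.printf("worst case: %.3f%n", wastedFraction(oneMiB + 1));
        // Sizes spread evenly between two powers of two.
        System.out.printf("average:    %.3f%n", averageWaste(oneMiB, 1024));
    }
}
```

Under that assumption the worst case comes out just under 0.5 and the
average close to 0.25, which lines up with the numbers quoted above.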

As it turns out, the hash aggregate tests depended on the internal
fragmentation: without it, the hash agg no longer spilled for the same
row count. Adjusted the generated row counts to recreate a data volume
that caused spilling.

One test in particular always failed due to assertions in the hash agg
code. These appear to be true bugs and are described in DRILL-7301. After
multiple failed attempts to get the test to work, it was disabled until
DRILL-7301 is fixed.

Added a new unit test to sanity check the mock data source. (No
dedicated test existed for this functionality; it was only verified
indirectly via other unit tests.)