DRILL-7293: Convert the regex ("log") plugin to use EVF
Converts the log format plugin (which uses a regex for parsing) to work
with the Enhanced Vector Framework (EVF).
User-visible behavior changes are documented in the README file.
* Use the plugin config object to pass config to the Easy framework.
* Use the EVF scan mechanism in place of the legacy "ScanBatch"
mechanism.
* Minor code and README cleanup.
* Replace ad-hoc type conversion with built-in conversions
The provided schema support in the Enhanced Vector Framework (EVF)
automatically converts VARCHAR to most types. The log format plugin
predates EVF and so supplied its own conversion mechanism. This commit
removes the ad-hoc conversion code and instead uses the schema
information in the log plugin config to create an "output schema", just
as if it had come from the provided schema framework.
Because the schema is needed in the plugin (rather than the reader),
moved the schema-parsing code out of the reader into the plugin. The
plugin creates two schemas: an "output schema" with the desired output
types, and a "reader schema" that uses only VARCHAR. The mismatch
between the two causes the EVF to perform the conversions.
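
The two-schema arrangement described above can be sketched as follows.
This is a minimal illustration, not the plugin code: plain Java maps
stand in for Drill's schema classes, and the names `TwoSchemaSketch`,
`readerSchema` and `convert` are hypothetical. The `convert` method only
imitates the kind of VARCHAR-to-type conversion that EVF performs.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: maps of column name -> type name stand in for real schema objects.
final class TwoSchemaSketch {

  // Reader schema: same column names as the output schema, but every type is VARCHAR.
  static Map<String, String> readerSchema(Map<String, String> outputSchema) {
    Map<String, String> reader = new LinkedHashMap<>();
    for (String name : outputSchema.keySet()) {
      reader.put(name, "VARCHAR");
    }
    return reader;
  }

  // Stand-in for the framework's conversion: the reader supplies a string,
  // and the output type decides what the value becomes.
  static Object convert(String value, String outputType) {
    switch (outputType) {
      case "INT":
        return Integer.parseInt(value);
      case "DOUBLE":
        return Double.parseDouble(value);
      default:
        return value; // VARCHAR, and any type this sketch does not model
    }
  }

  public static void main(String[] args) {
    Map<String, String> output = new LinkedHashMap<>();
    output.put("year", "INT");
    output.put("message", "VARCHAR");

    Map<String, String> reader = readerSchema(output);
    System.out.println("reader schema: " + reader);
    System.out.println("converted year: " + convert("2019", output.get("year")));
  }
}
```

The point of the split is that the reader only ever writes strings;
the difference between the two schemas is what tells the framework
which conversions are needed.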
* Enable provided schema support
Allows the user to specify types using either the format config (as
previously) or a provided schema. If a schema is provided, it will match
columns using names specified in the format config.
The provided schema can specify both types and modes (nullable or not
null).
If a schema is provided, then the types specified in the plugin config
are ignored. No attempt is made to merge schemas.
If a schema is provided, but a column is omitted from the schema, the
type defaults to VARCHAR.
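
The resolution rules above can be sketched as follows. This is an
illustration under stated assumptions, not the plugin code: the class
and method names are hypothetical, types are plain strings, names are
matched exactly, and a config column with no type is assumed to fall
back to VARCHAR as well.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: decides the output type of each column named in the format config.
final class SchemaResolutionSketch {

  // providedTypes == null means "no provided schema".
  static Map<String, String> resolve(List<String> configNames,
                                     Map<String, String> configTypes,
                                     Map<String, String> providedTypes) {
    Map<String, String> result = new LinkedHashMap<>();
    for (String name : configNames) {
      String type;
      if (providedTypes != null) {
        // A provided schema wins outright: config types are ignored, nothing is merged.
        // A column missing from the provided schema defaults to VARCHAR.
        type = providedTypes.getOrDefault(name, "VARCHAR");
      } else {
        // Assumption of this sketch: an untyped config column is VARCHAR.
        type = configTypes.getOrDefault(name, "VARCHAR");
      }
      result.put(name, type);
    }
    return result;
  }

  public static void main(String[] args) {
    List<String> names = Arrays.asList("year", "month", "msg");
    Map<String, String> configTypes = new HashMap<>();
    configTypes.put("year", "INT");
    configTypes.put("month", "INT");
    Map<String, String> provided = Collections.singletonMap("year", "SMALLINT");

    System.out.println("with provided schema:    " + resolve(names, configTypes, provided));
    System.out.println("without provided schema: " + resolve(names, configTypes, null));
  }
}
```

Note that `month` ends up as VARCHAR in the first case even though the
config says INT, because the config types are ignored once a schema is
provided.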
* Added ability to specify regex in table properties
Allows the user to specify the regex, and the column schema,
using a CREATE SCHEMA statement. The README file provides the details.
Unit tests demonstrate and verify the functionality.
* Used the custom error context provided by EVF to enhance the log format
reader error messages.
* Added user name to default EVF error context
* Added support for table functions
A table function can set the regex and maxErrors fields, but not the
schema. The schema defaults to "field_0", "field_1", etc. of type
VARCHAR.
* Added unit tests to verify the functionality.
* Added a check, and a test, for a regex with no groups.
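
A check of this kind can be sketched with the standard
`java.util.regex` API, which reports the number of capturing groups in
a pattern via `Matcher.groupCount()`. The class name, method name and
error message below are hypothetical; the actual check in the plugin may
differ.

```java
import java.util.regex.Pattern;

// Illustrative only: rejects a regex that defines no capturing groups,
// since such a pattern would yield no columns.
final class RegexGroupCheckSketch {

  // Returns the number of capturing groups, or throws if there are none.
  static int requireGroups(String regex) {
    // groupCount() depends only on the pattern, so an empty input is enough.
    int groups = Pattern.compile(regex).matcher("").groupCount();
    if (groups == 0) {
      throw new IllegalArgumentException(
          "Regex must contain at least one capturing group: " + regex);
    }
    return groups;
  }

  public static void main(String[] args) {
    System.out.println(requireGroups("(\\d+)-(\\d+)"));
    try {
      requireGroups("\\d+");
    } catch (IllegalArgumentException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

Non-capturing groups such as `(?:...)` do not count, so a pattern
made up only of those is rejected too.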
* Added columns array support
Previously, when the log regex plugin was given no schema, it
created a list of columns "field_0", "field_1", etc. After
this change, the plugin instead follows the pattern set by
the text plugin: it places all fields into the `columns`
array. (The two special fields are still separate.)
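
The idea can be sketched as follows: each capturing group becomes one
element of a single array instead of its own named column. This is an
illustration only. The class and method names are hypothetical, the
special fields are not modeled, and the use of a full-line `matches()`
is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: collects all regex groups of one log line into a single list,
// the way a `columns` array holds all fields of a row.
final class ColumnsArraySketch {

  // Returns the groups in order, or empty if the line does not match the pattern.
  static Optional<List<String>> toColumns(Pattern pattern, String line) {
    Matcher m = pattern.matcher(line);
    if (!m.matches()) {
      return Optional.empty();
    }
    List<String> columns = new ArrayList<>();
    for (int i = 1; i <= m.groupCount(); i++) {
      // group(i) is null when an optional group did not take part in the match.
      columns.add(m.group(i));
    }
    return Optional.of(columns);
  }

  public static void main(String[] args) {
    Pattern p = Pattern.compile("(\\d{4})-(\\d{2})-(\\d{2}) (.*)");
    System.out.println(toColumns(p, "2019-06-24 started"));
    System.out.println(toColumns(p, "not a log line"));
  }
}
```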
A few adjustments were necessary to the columns array
framework to allow use of the special columns along with
the `columns` column.
Modified unit tests and the README to reflect this change.
The change should cause few compatibility problems because few
users are likely to rely on the dummy field names.
Added unit tests to verify that schema-based table
functions work. A test shows that, due to the unfortunate
config property name "schema", users of this plugin cannot
combine a config table function with the schema attribute
in the way promised in DRILL-6965.