DRILL-7506: Simplify code gen error handling

Pushes code gen error handling close to the code gen itself to

allow clearer error messages. Doing so avoids the need to bubble

code gen exceptions up the call stack, resulting in cleaner

operator code.

closes #1948

  1. … 40 more files in changeset.
DRILL-7393: Revisit Drill tests to ensure that patching is executed before any test run

- Added BaseTest with patchers and extended all tests from it.

- Added a test to java-exec module to ensure that all tests there are inherited from BaseTest.

- Revised exception handling in the patchers, now it's individual for each patching method.

closes #1910

    • -1
    • +2
    ./metadata/schema/TestSchemaProvider.java
  1. … 138 more files in changeset.
DRILL-7413: Test and fix scan operator vectors

Enables vector validation tests for the ScanBatch and all

EasyFormat plugins. Fixes a bug in scan batch that failed to set

the record count in the output container.

Fixes a number of formatting and other issues found while adding

the tests.

    • -190
    • +186
    ./vector/TestDateTypes.java
  1. … 6 more files in changeset.
DRILL-7412: Minor unit test improvements

Many tests intentionally trigger errors. A debug-only log setting

sent those errors to stdout. The resulting stack dumps simply cluttered

the test output, so disabled error output to the console.

Drill can apply bounds checks to vectors. Tests run via Maven

enable bounds checking. Now, bounds checking is also enabled in

"debug mode" (when assertions are enabled, as in an IDE.)

Drill contains two test frameworks. The older BaseTestQuery was

marked as deprecated, but many tests still use it and are unlikely

to be changed soon. So, removed the deprecated marker to reduce the

number of spurious warnings.

Also includes a number of minor clean-ups.

closes #1876

  1. … 17 more files in changeset.
DRILL-7359: Add support for DICT type in RowSet Framework

closes #1870

  1. … 82 more files in changeset.
DRILL-7168: Implement ALTER SCHEMA ADD / REMOVE commands

    • -4
    • +19
    ./metadata/schema/TestSchemaProvider.java
  1. … 15 more files in changeset.
DRILL-7350: Move RowSet related classes from test folder

  1. … 291 more files in changeset.
DRILL-7314: Use TupleMetadata instead of concrete implementation

1. Add ser / de implementation for TupleMetadata interface based on types.

2. Replace TupleSchema usage where possible.

3. Move patcher classes into commons.

4. Upgrade some dependencies and general refactoring.

    • -0
    • +1
    ./metadata/schema/TestSchemaProvider.java
  1. … 39 more files in changeset.
DRILL-7310: Move schema-related classes from exec module to be able to use them in metastore module

closes #1816

    • -209
    • +0
    ./metadata/TestMetadataProperties.java
    • -355
    • +0
    ./metadata/schema/parser/TestSchemaParser.java
  1. … 96 more files in changeset.
DRILL-7273: Introduce operators for handling metadata

closes #1886

  1. … 156 more files in changeset.
DRILL-7278: Refactor result set loader projection mechanism

Drill 1.16 added an enhanced scan framework based on the row set

mechanisms, and a "provisioned schema" feature built on top

of that framework. Conversion of the log reader plugin to use

the framework identified additional features we wish to add,

such as marking a column as "special" (not expanded in a wildcard

query.)

This work identified that the code added for provisioned schemas in

Drill 1.16 worked, but is a bit overly complex, making it hard to add

the desired new feature.

This patch refactors the "reader" projection code:

* Create a "projection set" mechanism that the reader can query to ask,

"the caller just added a column. Should it be projected or not?"

* Unifies the type conversion mechanism added as part of provisioned

schemas.

* Added the "special column" property for both "reader" and "provided"

schemas.

* Verified that provisioned schemas work with maps (at least on the scan

framework side.)

* Replaced the previous "schema transformer" mechanism with a new "type

conversion" mechanism that unifies type conversion, provided schemas

and an optional custom type conversion mechanism.

* Column writers can report if they are projected. Moved this query

from metadata to the column writer itself.

* Extended and clarified documentation of the feature.

* Revised and/or added unit tests.

closes #1797
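
The "projection set" query described above can be pictured with a minimal sketch. This is not the code from the patch: the class name `ProjectionSetSketch`, its factory methods, and the rule that a "special" column is projected only when named explicitly are assumptions made for illustration.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration only; not a class from the Drill code base.
final class ProjectionSetSketch {
  private final boolean wildcard;
  private final Set<String> names = new HashSet<>();

  private ProjectionSetSketch(boolean wildcard, String... columns) {
    this.wildcard = wildcard;
    for (String column : columns) {
      // Column names are compared case-insensitively in this sketch.
      names.add(column.toLowerCase(Locale.ROOT));
    }
  }

  // Models a wildcard (SELECT *) projection.
  static ProjectionSetSketch forWildcard() {
    return new ProjectionSetSketch(true);
  }

  // Models an explicit projection list (SELECT a, b).
  static ProjectionSetSketch forColumns(String... columns) {
    return new ProjectionSetSketch(false, columns);
  }

  // The reader asks: "I just added this column. Should it be projected?"
  boolean isProjected(String column, boolean special) {
    if (names.contains(column.toLowerCase(Locale.ROOT))) {
      return true; // explicitly requested columns are always projected
    }
    // Assumption: a wildcard expands to every column except "special" ones.
    return wildcard && !special;
  }
}
```

With this sketch, `ProjectionSetSketch.forWildcard().isProjected("_raw", true)` is false, while `ProjectionSetSketch.forColumns("_raw").isProjected("_raw", true)` is true.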

    • -20
    • +0
    ./metadata/TestMetadataProperties.java
  1. … 72 more files in changeset.
DRILL-7159: Fix typeString method to return correct name for MAP (aka STRUCT)

closes #1741

    • -1
    • +6
    ./metadata/schema/TestSchemaProvider.java
  1. … 2 more files in changeset.
DRILL-7143: Support default value for empty columns

Modifies the prior work to add default values for columns. The prior work added defaults

when the entire column is missing from a reader (the old Nullable Int column). The Row

Set mechanism now will also "fill empty" slots with the default value.

Added default support for the column writers. The writers automatically obtain the

default value from the column schema. The default can also be set explicitly on

the column writer.

Updated the null column mechanism to use this feature rather than the ad-hoc

implementation in the prior commit.

Semantics changed a bit. Only Required columns take a default. The default value

is ignored for nullable columns since nullable columns already have a file default: NULL.

Other changes:

* Updated the CSV-with-schema tests to illustrate the new behavior.

* Made multiple fixes for Boolean and Decimal columns and added unit tests.

* Upgraded FreeMarker to version 2.3.28 to allow use of the continue statement.

* Reimplemented the Bit column reader and writer to use the BitVector directly since this vector is rather special.

* Added get/set Boolean methods for column accessors

* Moved the BooleanType class to the common package

* Added more CSV unit tests to explore decimal types, booleans, and defaults

* Add special handling for blank fields in from-string conversions

* Added options to the conversion factory to specify blank-handling behavior.

CSV uses a mapping of blanks to null (nullable) or default value (non-nullable)

closes #1726
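
The "fill empty" semantics described above (required columns take the default, nullable columns keep NULL) can be sketched as follows. This is an illustration under stated assumptions, not the column writer code: `FillEmptySketch`, `ColumnSpec`, and the empty-string fallback for a required column with no declared default are all hypothetical.

```java
import java.util.Arrays;

// Hypothetical illustration only; not a class from the Drill code base.
final class FillEmptySketch {

  // Stand-in for a column schema: a nullability flag plus an optional default.
  static final class ColumnSpec {
    final boolean nullable;
    final String defaultValue;

    ColumnSpec(boolean nullable, String defaultValue) {
      this.nullable = nullable;
      this.defaultValue = defaultValue;
    }
  }

  // Returns a copy of `written` in which unwritten (null) slots are filled.
  static String[] fillEmpties(ColumnSpec spec, String[] written) {
    String[] out = Arrays.copyOf(written, written.length);
    if (spec.nullable) {
      // Nullable columns ignore the default; empty slots stay NULL.
      return out;
    }
    // Assumption: a required column without a declared default falls back to "".
    String fill = spec.defaultValue != null ? spec.defaultValue : "";
    for (int i = 0; i < out.length; i++) {
      if (out[i] == null) {
        out[i] = fill;
      }
    }
    return out;
  }
}
```

For a required column with default `"n/a"`, the input `{"a", null, "c"}` becomes `{"a", "n/a", "c"}`; for a nullable column the same input is returned unchanged.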

  1. … 72 more files in changeset.
DRILL-7096: Develop vector for canonical Map<K,V>

- Added new type DICT;

- Created value vectors for the type for single and repeated modes;

- Implemented corresponding FieldReaders and FieldWriters;

- Made changes in EvaluationVisitor to be able to read values from the map by key;

- Made changes to DrillParquetGroupConverter to be able to read Parquet's MAP type;

- Added an option `store.parquet.reader.enable_map_support` to disable reading MAP type as DICT from Parquet files;

- Updated AvroRecordReader to use new DICT type for Avro's MAP;

- Added support of the new type to ParquetRecordWriter.

    • -4
    • +16
    ./ExpressionTreeMaterializerTest.java
    • -0
    • +459
    ./vector/TestDictVector.java
  1. … 106 more files in changeset.
DRILL-6965: Implement schema table function parameter

1. Added common schema table function parameter which can be used as a single unit or with format plugin table function parameters.

2. Allowed creating a schema without columns, in case the user needs only to indicate table properties.

3. Added unit tests.

closes #1777

    • -6
    • +37
    ./metadata/schema/TestSchemaProvider.java
    • -22
    • +61
    ./metadata/schema/parser/TestSchemaParser.java
  1. … 28 more files in changeset.
DRILL-7011: Support schema in scan framework

* Adds schema support to the row set-based scan framework and to the "V3" text reader based on that framework.

* Adding the schema made clear that passing options as a long list of constructor arguments was not sustainable. Refactored code to use a builder pattern instead.

* Added support for default values in the "null column loader", which required adding a "setValue" method to the column accessors.

* Added unit tests for all new or changed functionality. See TestCsvWithSchema for the overall test of the entire integrated mechanism.

* Added tests for explicit projection with schema

* Better handling of date/time in column accessors

* Converted recent column metadata work from Java 8 date/time to Joda.

* Added more CSV-with-schema unit tests

* Removed the ID fields from "resolved columns", used "instanceof" instead.

* Added wildcard projection with an output schema. Handles both "lenient" and "strict" schemas.

* Tagged projection columns with their output schema, when available.

* Scan projection added modes for wildcard with an output schema. The reader projection added support for merging reader and output schemas.

* Includes refactoring of scan operator tests (the test file grew too large.)

* Renamed some classes to avoid confusing reader schemas with output schemas.

* Added unit tests for the new functionality.

* Added "lenient" wildcard with schema test for CSV

* Added more type conversions: string-to-bit, many-to-string

* Fixed bug in column writer for VarDecimal

* Added missing unit tests, and fixed bugs, in Bit column reader/writer

* Cleaned up a number of unneeded "SuppressWarnings"

closes #1711
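
The move from long constructor argument lists to a builder, mentioned in the list above, might look roughly like the sketch below. The class `ScanOptionsSketch`, its option names, and the default values are hypothetical and chosen only for illustration; they are not the builder introduced by the patch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical options holder; not a class from the Drill code base.
final class ScanOptionsSketch {
  private final List<String> projection;
  private final boolean strictSchema;
  private final int batchRowLimit;

  private ScanOptionsSketch(Builder builder) {
    this.projection = Collections.unmodifiableList(new ArrayList<>(builder.projection));
    this.strictSchema = builder.strictSchema;
    this.batchRowLimit = builder.batchRowLimit;
  }

  static Builder builder() {
    return new Builder();
  }

  List<String> projection() { return projection; }
  boolean strictSchema() { return strictSchema; }
  int batchRowLimit() { return batchRowLimit; }

  static final class Builder {
    private final List<String> projection = new ArrayList<>();
    private boolean strictSchema;       // assumed default: lenient
    private int batchRowLimit = 4096;   // assumed default, for illustration only

    Builder project(String column) {
      projection.add(column);
      return this;
    }

    Builder strictSchema(boolean strict) {
      this.strictSchema = strict;
      return this;
    }

    Builder batchRowLimit(int limit) {
      if (limit <= 0) {
        throw new IllegalArgumentException("limit must be positive");
      }
      this.batchRowLimit = limit;
      return this;
    }

    ScanOptionsSketch build() {
      return new ScanOptionsSketch(this);
    }
  }
}
```

A caller sets only the options it cares about, for example `ScanOptionsSketch.builder().project("a").strictSchema(true).build()`, and new options can be added later without touching existing call sites.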

  1. … 222 more files in changeset.
DRILL-7092: Rename map to struct in schema definition

1. Renamed map to struct in schema parser.

2. Updated sqlTypeOf function to return STRUCT instead of MAP, drillTypeOf function will return MAP as before until internal renaming is done.

3. Add is_struct alias to already existing is_map function. Function should be revisited once Drill supports true maps.

4. Updated unit tests.

closes #1688

    • -11
    • +11
    ./metadata/schema/parser/TestSchemaParser.java
  1. … 5 more files in changeset.
DRILL-7086: Output schema for row set mechanism

Enhances the row set mechanism to take an "output schema" that describes the vectors to

create. The "input schema" describes the type that the reader would like to write. A

conversion mechanism inserts a conversion shim to convert from the input to output type.

Provides a set of implicit type conversions, including string-to-date/time conversions

which use the new format property stored in column metadata. Includes unit tests for

the new functionality.

closes #1690
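
The idea of a conversion shim between the reader's input type and the output vector type can be sketched as below, using string-to-date as the example. This is only an illustration: `ConversionShimSketch`, `DateSink`, and `StringToDateShim` are hypothetical names, and the sketch uses `java.time` to stay self-contained, which is not a statement about the date/time library Drill uses.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical illustration only; not a class from the Drill code base.
final class ConversionShimSketch {

  // Stand-in for the writer of the output (DATE) column.
  interface DateSink {
    void setDate(LocalDate value);
  }

  // Sits between a reader that writes strings and a DATE output column.
  static final class StringToDateShim {
    private final DateSink target;
    private final DateTimeFormatter format;

    // `pattern` plays the role of the format property kept in column metadata.
    StringToDateShim(DateSink target, String pattern) {
      this.target = target;
      this.format = DateTimeFormatter.ofPattern(pattern);
    }

    // The reader calls this with its native type; the shim converts and forwards.
    void setString(String value) {
      target.setDate(LocalDate.parse(value, format));
    }
  }
}
```

The reader keeps writing strings and never needs to know that the output column is a date; only the shim depends on the format.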

    • -0
    • +229
    ./metadata/TestMetadataProperties.java
    • -0
    • +824
    ./metadata/TestTupleSchema.java
    • -5
    • +8
    ./metadata/schema/TestSchemaProvider.java
  1. … 61 more files in changeset.
DRILL-7073: CREATE SCHEMA command / TupleSchema / ColumnMetadata improvements

1. Add format, default, column properties logic.

2. Changed schema JSON after serialization.

3. Added appropriate unit tests.

closes #1684

    • -41
    • +92
    ./metadata/schema/TestSchemaProvider.java
    • -41
    • +78
    ./metadata/schema/parser/TestSchemaParser.java
  1. … 12 more files in changeset.
DRILL-6903: SchemaBuilder code improvements

1. ColumnBuilder: setPrecisionAndScale method

2. SchemaContainer: addColumn method parameter AbstractColumnMetadata was changed to ColumnMetadata

3. MapBuilder / RepeatedListBuilder / UnionBuilder: added constructors without parent, made buildColumn method public

4. TupleMetadata: added toMetadataList method

5. Other refactoring

  1. … 26 more files in changeset.
DRILL-6901: Move schema builder to src/main

Moves the SchemaBuilder class out of the src/test name space into the src/main namespace. Specifically, into the existing record.metadata package.

Many files changed in this move. Corrected two minor issues: import of the wrong Arrays class and unnecessary annotations.

  1. … 86 more files in changeset.
DRILL-6964: Implement CREATE / DROP SCHEMA commands

Note: this PR only adds support for CREATE / DROP SCHEMA commands, which allow storing and deleting a schema. Schema usage when querying the data will be covered in other PRs.

1. Added parser methods / handles to parse CREATE / DROP schema commands.

2. Added SchemaProviders classes to separate ways of schema provision (file, table function).

3. Added schema parsing using ANTLR4 (lexer, parser, visitors).

4. Added appropriate unit tests.

close apache/drill#1615

    • -0
    • +232
    ./metadata/schema/TestSchemaProvider.java
    • -0
    • +157
    ./metadata/schema/parser/TestParserErrorHandling.java
    • -0
    • +279
    ./metadata/schema/parser/TestSchemaParser.java
  1. … 33 more files in changeset.
DRILL-6422: Replace guava imports with shaded ones

  1. … 980 more files in changeset.
DRILL-6656: Disallow extra semicolons and multiple statements on the same line.

closes #1415

  1. … 143 more files in changeset.
DRILL-6486: BitVector split and transfer does not work correctly for non byte-multiple transfer lengths

Fix for the bug in BitVector splitAndTransfer. The logic for handling copy of last-n bits was incorrect for non byte-multiple transfer lengths.

closes #1316
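
To show why non byte-multiple lengths need care, here is a minimal sketch of copying a bit range out of a packed byte array. It is not Drill's BitVector code: the class `BitRangeCopySketch`, its methods, and the least-significant-bit-first packing are assumptions for illustration.

```java
// Hypothetical illustration only; not the BitVector implementation.
final class BitRangeCopySketch {

  // Copies `length` bits starting at bit index `start` from `src` into a new
  // array whose bit 0 corresponds to source bit `start`. Bits are assumed to be
  // packed least-significant-bit first within each byte.
  static byte[] splitBits(byte[] src, int start, int length) {
    byte[] dest = new byte[(length + 7) / 8];
    int byteOffset = start / 8;
    int bitShift = start % 8;
    for (int i = 0; i < dest.length; i++) {
      int lo = (src[byteOffset + i] & 0xFF) >>> bitShift;
      int hi = 0;
      // Pull the remaining high bits from the next source byte, if there is one.
      if (bitShift != 0 && byteOffset + i + 1 < src.length) {
        hi = (src[byteOffset + i + 1] & 0xFF) << (8 - bitShift);
      }
      dest[i] = (byte) (lo | hi);
    }
    // The non byte-multiple case: clear the bits past `length` in the last byte.
    int tailBits = length % 8;
    if (tailBits != 0) {
      dest[dest.length - 1] &= (byte) ((1 << tailBits) - 1);
    }
    return dest;
  }

  // Reads a single bit using the same least-significant-bit-first layout.
  static boolean getBit(byte[] bits, int index) {
    return ((bits[index / 8] >>> (index % 8)) & 1) == 1;
  }
}
```

Without the final masking step, a transfer of 5 or 7 bits would carry stray bits from beyond the requested range into the last destination byte.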

  1. … 2 more files in changeset.
DRILL-6438: Remove excess logging from the tests.

- Removed usages of System.out and System.err from the tests and replaced them with loggers

closes #1284

  1. … 88 more files in changeset.
DRILL-6386: Remove unused imports and star imports.

  1. … 231 more files in changeset.
DRILL-6422: Update guava to 23.0 and shade it

- Fix compilation errors for new version of Guava.

- Remove usage of deprecated API

- Shade guava and add dependencies to the shaded version

- Ban unshaded package

- Introduce drill-shaded module and move guava-shaded under it

- Add methods to convert shaded guava lists to the unshaded ones

- Add instruction for publishing artifacts to the Apache repository

  1. … 81 more files in changeset.
DRILL-6334: Minor code cleanup

This closes #1213

  1. … 5 more files in changeset.
DRILL-6320: Fixed license headers.

closes #1207

  1. … 2064 more files in changeset.