DRILL-6118: Handle item star columns during project / filter push down and directory pruning

1. Added DrillFilterItemStarReWriterRule to re-write item star fields to regular field references.

2. Refactored DrillPushProjectIntoScanRule to handle item star fields, factored out helper classes and methods from PrelUtil.class.

3. Fixed an issue with dynamic star usage (after the Calcite upgrade the old usage of star was still present; replaced WILDCARD -> DYNAMIC_STAR for clarity).

4. Added unit tests to check project / filter push down and directory pruning with item star.

  1. … 26 more files in changeset.
DRILL-6049: Misc. hygiene and code cleanup changes

close apache/drill#1085

  1. … 123 more files in changeset.
DRILL-5919: Add non-numeric support for JSON processing

1. Added two session options, store.json.reader.non_numeric_numbers and store.json.reader.non_numeric_numbers, that allow reading/writing NaN and Infinity as numbers. By default these options are set to true.

2. Extended the signatures of the convert_toJSON and convert_fromJSON functions with a second, optional parameter that enables/disables reading/writing NaN and Infinity. By default it is set to true.

3. Added unit tests with NaN and Infinity values for math and aggregate functions.

4. Replaced JsonReader's constructors with a builder.

This closes #1026

  1. … 16 more files in changeset.
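
The NaN/Infinity handling described in the DRILL-5919 entry above can be illustrated with a small, standalone sketch. This is not Drill's JsonReader code: the class name `NonNumericNumbers` and the `allowNanInf` flag are hypothetical stand-ins for the reader-side session option, and the sketch only shows the idea of accepting or rejecting non-numeric tokens based on a boolean setting.

```java
// Hypothetical sketch, not Drill's implementation: a boolean flag decides
// whether the tokens NaN, Infinity and -Infinity are accepted as numbers.
final class NonNumericNumbers {

    /** Parses a numeric token; allowNanInf is an assumed stand-in for the session option. */
    static double parse(String token, boolean allowNanInf) {
        boolean nonNumeric = token.equals("NaN")
                || token.equals("Infinity")
                || token.equals("-Infinity");
        if (nonNumeric && !allowNanInf) {
            throw new IllegalArgumentException("Non-numeric number not allowed: " + token);
        }
        // Double.parseDouble accepts "NaN", "Infinity" and "-Infinity" as well as ordinary decimals.
        return Double.parseDouble(token);
    }

    public static void main(String[] args) {
        System.out.println(parse("NaN", true));
        System.out.println(parse("-Infinity", true));
        System.out.println(parse("1.5", false));
    }
}
```

With the flag disabled, `parse("Infinity", false)` throws instead of silently producing a value, which is roughly the behaviour a strict reader would want.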
DRILL-5864: Selecting a non-existing field from a MapR-DB JSON table fails with NPE.

    ./JsonReaderUtils.java (+94 -0)
  1. … 4 more files in changeset.
DRILL-5355: Misc. code cleanup closes #784

  1. … 23 more files in changeset.
DRILL-3562: Query fails when using flatten on JSON data where some documents have an empty array

closes #713

  1. … 2 more files in changeset.
DRILL-4653: Malformed JSON should not stop the entire query from progressing

This closes #518

  1. … 8 more files in changeset.
DRILL-4479: For empty fields under all_text_mode enabled (a) use varchar for the default columns and (b) ensure we create fields corresponding to all columns.

close apache/drill#420

  1. … 3 more files in changeset.
DRILL-4184: Support variable length decimal fields in parquet

  1. … 68 more files in changeset.
DRILL-4382: Remove dependency on drill-logical from vector package

  1. … 80 more files in changeset.
DRILL-2288: Fix ScanBatch violation of IterOutcome protocol and downstream chain of bugs.

Increments:

2288: Pt. 1 Core: Added unit test. [Drill2288GetColumnsMetadataWhenNoRowsTest, empty.json]

2288: Pt. 1 Core: Changed HBase test table #1's # of regions from 1 to 2. [HBaseTestsSuite]

Also added TODO(DRILL-3954) comment about # of regions.

2288: Pt. 2 Core: Documented IterOutcome much more clearly. [RecordBatch]

Also edited some related Javadoc.

2288: Pt. 2 Hyg.: Edited doc., added @Override, etc. [AbstractRecordBatch, RecordBatch]

Purged unused SetupOutcome.

Added @Override.

Edited comments.

Changed some comments to doc. comments.

2288: Pt. 3 Core&Hyg.: Added validation of IterOutcome sequence. [IteratorValidatorBatchIterator]

Also:

Renamed internal members for clarity.

Added comments.

2288: Pt. 4 Core: Fixed a NONE -> OK_NEW_SCHEMA in ScanBatch.next(). [ScanBatch]

(With nearby comments.)

2288: Pt. 4 Hyg.: Edited comments, reordered, whitespace. [ScanBatch]

Reordered

Added comments.

Aligned.

2288: Pt. 4 Core+: Fixed UnionAllRecordBatch to receive IterOutcome sequence right. (3659) [UnionAllRecordBatch]

2288: Pt. 5 Core: Fixed ScanBatch.Mutator.isNewSchema() to stop spurious "new schema" reports (fix short-circuit OR, to call resetting method right). [ScanBatch]

2288: Pt. 5 Hyg.: Renamed, edited comments, reordered. [ScanBatch, SchemaChangeCallBack, AbstractSingleRecordBatch]

Renamed getSchemaChange -> getSchemaChangedAndReset.

Renamed schemaChange -> schemaChanged.

Added doc. comments.

Aligned.

2288: Pt. 6 Core: Avoided dummy Null.IntVec. column in JsonReader when not needed (MapWriter.isEmptyMap()). [JsonReader, 3 vector files]

2288: Pt. 6 Hyg.: Edited comments, message. Fixed message formatting. [RecordReader, JSONFormatPlugin, JSONRecordReader, AbstractMapVector, JsonReader]

Fixed message formatting.

Edited comments.

Edited message.

Fixed spurious line break.

2288: Pt. 7 Core: Added column families in HBaseRecordReader* to avoid dummy Null.IntVec. clash. [HBaseRecordReader]

2288: Pt. 8 Core.1: Cleared recordCount in OrderedPartitionRecordBatch.innerNext(). [OrderedPartitionRecordBatch]

2288: Pt. 8 Core.2: Cleared recordCount in ProjectRecordBatch.innerNext. [ProjectRecordBatch]

2288: Pt. 8 Core.3: Cleared recordCount in TopNBatch.innerNext. [TopNBatch]

2288: Pt. 9 Core: Had UnorderedReceiverBatch reset RecordBatchLoader's record count. [UnorderedReceiverBatch, RecordBatchLoader]

2288: Pt. 9 Hyg.: Added comments. [RecordBatchLoader]

2288: Pt. 10 Core: Worked around mismatched map child vectors in MapVector.getObject(). [MapVector]

2288: Pt. 11 Core: Added OK_NEW_SCHEMA schema comparison for HashAgg. [HashAggTemplate]

2288: Pt. 12 Core: Fixed memory leak in BaseTestQuery's printing.

Fixed bad skipping of RecordBatchLoader.clear(...) and QueryDataBatch.load(...) for zero-row batches in printResult(...).

Also, dropped suppression of call to VectorUtil.showVectorAccessibleContent(...) (so zero-row batches are as visible as others).

2288: Pt. 13 Core: Fixed test that used unhandled periods in column alias identifiers.

2288: Misc.: Added # of rows to showVectorAccessibleContent's output. [VectorUtil]

2288: Misc.: Added simple/partial toString() [VectorContainer, AbstractRecordReader, JSONRecordReader, BaseValueVector, FieldSelection, AbstractBaseWriter]

2288: Misc. Hyg.: Added doc. comments to VectorContainer. [VectorContainer]

2288: Misc. Hyg.: Edited comment. [DrillStringUtils]

2288: Misc. Hyg.: Clarified message for unhandled identifier containing period.

2288: Pt. 3 Core&Hyg. Upd.: Added schema comparison result to logging. [IteratorValidatorBatchIterator]

2288: Pt. 7 Core Upd.: Handled HBase columns too re NullableIntVectors. [HBaseRecordReader, TestTableGenerator, TestHBaseFilterPushDown]

Created map-child vectors for requested columns.

Added unit test method testDummyColumnsAreAvoided, adding new row to test table, updated some row counts.

2288: Pt. 7 Hyg. Upd.: Edited comment. [HBaseRecordReader]

2288: Pt. 11 Core Upd.: REVERTED all of bad OK_NEW_SCHEMA schema comparison for HashAgg. [HashAggTemplate]

This reverts commit 0939660f4620c03da97f4e1bf25a27514e6d0b81.

2288: Pt. 6 Core Upd.: Added isEmptyMap override in new (just-rebased-in) PromotableWriter. [PromotableWriter]

Adjusted definition and default implementation of isEmptyMap (to handle MongoDB storage plugin's use of JsonReader).

2288: Pt. 6 Hyg. Upd.: Purged old atLeastOneWrite flag. [JsonReader]

2288: Pt. 14: Disabled newly dying test testNestedFlatten().

  1. … 38 more files in changeset.
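
The "Pt. 5 Core" increment of DRILL-2288 above mentions a short-circuit OR that prevented a resetting method from being called. The sketch below shows one plausible shape of such a bug; it is an assumption, not the actual ScanBatch.Mutator code. `SchemaFlag` and `ShortCircuitDemo` are hypothetical names, and `getAndReset()` loosely mirrors the renamed getSchemaChangedAndReset.

```java
// Hypothetical sketch of a short-circuit OR hiding a side effect.
final class SchemaFlag {
    private boolean changed;

    void markChanged() { changed = true; }

    /** Returns the flag and clears it. */
    boolean getAndReset() {
        boolean result = changed;
        changed = false;
        return result;
    }
}

final class ShortCircuitDemo {

    // Buggy: when localChange is true, || never evaluates getAndReset(),
    // so the flag stays set and the NEXT call reports a stale schema change.
    static boolean isNewSchemaBuggy(boolean localChange, SchemaFlag flag) {
        return localChange || flag.getAndReset();
    }

    // Fixed: always call the resetting method, then combine the results.
    static boolean isNewSchemaFixed(boolean localChange, SchemaFlag flag) {
        boolean callbackChange = flag.getAndReset();
        return localChange || callbackChange;
    }
}
```

In the buggy variant, a call with `localChange == true` followed by a call with `localChange == false` reports "new schema" twice; the fixed variant reports it once.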
DRILL-3229: Miscellaneous Union-type fixes

closes #207

closes #180

  1. … 38 more files in changeset.
DRILL-3229: Implement Union type vector

  1. … 53 more files in changeset.
DRILL-3773: Fix Mongo FieldSelection

The Mongo plugin was previously rewriting a complex (multi-level) column reference as a simple selection of the top-level field.

This changeset does not change this behavior in terms of the filter sent to Mongo, but it adds the original selected column to the list that will be read in by the JSON reader once that data is returned from Mongo.

This means we will still be requesting more data from Mongo than necessary (as we were previously), but we now leverage the existing functionality in the JSON reader to grab only the sub-selection actually requested in the query. This allows difficult schema changes to be avoided by projecting only columns without schema changes.

This also fixes FieldSelection bugs that caused wrong results when selecting a nested column and its parent, and adds unit tests for them.

  1. … 7 more files in changeset.
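
The nested-column-plus-parent case described in the DRILL-3773 entry above can be sketched as a small path tree. This is an illustrative assumption, not Drill's FieldSelection class: `PathSelection`, `add` and `isSelected` are hypothetical names, and dotted strings stand in for Drill's schema paths.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: a tree of selected column paths.
final class PathSelection {
    private final Map<String, PathSelection> children = new HashMap<>();
    private boolean selectAll; // true when this exact path was requested in full

    /** Registers a dotted path such as "a.b" as selected. */
    void add(String dottedPath) {
        PathSelection node = this;
        for (String part : dottedPath.split("\\.")) {
            node = node.children.computeIfAbsent(part, k -> new PathSelection());
        }
        node.selectAll = true;
    }

    /** Returns true if the reader needs to materialize the given path. */
    boolean isSelected(String dottedPath) {
        PathSelection node = this;
        for (String part : dottedPath.split("\\.")) {
            if (node.selectAll) {
                return true; // an ancestor was selected whole, so everything below it is needed
            }
            node = node.children.get(part);
            if (node == null) {
                return false; // nothing selected along this branch
            }
        }
        return true; // the path itself is selected, or it is a prefix of a selected path
    }
}
```

Selecting only `a.b` excludes `a.x`, while selecting both `a.b` and its parent `a` must include `a.x`; that second case is the kind of combination where a naive merge can give wrong results.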
DRILL-1942-hygiene

- Formatting

- @Overrides

- finals

- some AutoCloseable additions

- new isCancelled() abstract method on FragmentManager, implemented on subclasses

Added missing new abstract method isCancelled()

Close apache/drill#120

  1. … 23 more files in changeset.
DRILL-2879: Part 2 - Enhancing extended JSON support for date in millis and binary with type info

Addressing review comments

Updated unit test to remove timezone that was being pulled from the local system (and thus failed to match the baseline if run from a different timezone)

  1. … 3 more files in changeset.
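
The timezone problem noted in the DRILL-2879 Part 2 entry above is easy to reproduce outside Drill. The following sketch uses only java.time; `MillisFormatting` and its methods are hypothetical names, not part of the Drill test suite. It shows that the same epoch-millisecond value renders differently per zone, which is why a baseline should be produced with a fixed zone rather than the system default.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical sketch: formatting epoch millis with an explicit zone.
final class MillisFormatting {
    private static final DateTimeFormatter FMT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    /** Formats epoch milliseconds in the given zone. */
    static String format(long epochMillis, ZoneId zone) {
        return FMT.format(Instant.ofEpochMilli(epochMillis).atZone(zone));
    }

    /** Formats epoch milliseconds in UTC, independent of the machine's default zone. */
    static String formatUtc(long epochMillis) {
        return format(epochMillis, ZoneOffset.UTC);
    }

    public static void main(String[] args) {
        System.out.println(formatUtc(0L));                       // 1970-01-01 00:00:00
        System.out.println(format(0L, ZoneOffset.ofHours(9)));   // 1970-01-01 09:00:00
    }
}
```

A test that calls `format(millis, ZoneId.systemDefault())` would match its baseline only on machines in the zone where the baseline was recorded.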
DRILL-2879: Enhancing extended JSON support for date in millis and binary with type info

Ignore project push down Mongo test until test completes correctly on Linux

  1. … 4 more files in changeset.
DRILL-3476: Merge paths in FieldSelection

Conflicts:

exec/java-exec/src/test/java/org/apache/drill/exec/store/json/TestJsonRecordReader.java

  1. … 2 more files in changeset.
DRILL-3319: Replaced UserException#build() method with #build(Logger) method to log from the correct class

+ Fixed docs in UserException

+ Created loggers, and changed logger visibility to private

  1. … 37 more files in changeset.
DRILL-3019: In JsonReader, update atLeastOneWrite to true if writeListDataIfTyped and writeMapDataIfTyped write a value

  1. … 1 more file in changeset.
DRILL-2350: Improve exception handling and error messages in JSON reader.

  1. … 7 more files in changeset.
DRILL-2193: implement fast count / skip-all semantics for JSON reader

  1. … 4 more files in changeset.
DRILL-2695: Add support for large IN conditions through the use of the Values operator. Update JSON reader to support reading extended JSON. Update JSON writer to support writing extended JSON data. Update JSON reader to automatically unwrap a file that includes a single top-level array (used by Values). Update Options manager to use getOption(<Type>Validator) to directly retrieve typed value. Remove JSON rewinding. Add support for CONVERT_TO( [], 'SIMPLEJSON') to disable extended types as part of UDF use.

    ./BasicJsonOutput.java (+530 -0)
    ./DateOutputFormat.java (+37 -0)
    ./ExtendedJsonOutput.java (+183 -0)
    ./ExtendedTypeName.java (+29 -0)
    ./JsonOutput.java (+109 -0)
    ./VectorOutput.java (+295 -0)
    ./WorkingBuffer.java (+70 -0)
  1. … 56 more files in changeset.
DRILL-1460: Implement "read_numbers_as_double" option for JSON reader

Conflicts:

contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoRecordReader.java

exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java

exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/conv/JsonConvertFrom.java

exec/java-exec/src/main/java/org/apache/drill/exec/server/options/SystemOptionManager.java

exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/json/JSONRecordReader.java

exec/java-exec/src/main/java/org/apache/drill/exec/vector/complex/fn/JsonReader.java

Conflicts:

exec/java-exec/src/main/java/org/apache/drill/exec/vector/complex/fn/JsonReader.java

exec/java-exec/src/test/java/org/apache/drill/exec/store/json/TestJsonRecordReader.java

  1. … 7 more files in changeset.
DRILL-1871: Compressed JSON read support.

  1. … 3 more files in changeset.
DRILL-1764: return max value capacity if vector has no children

  1. … 6 more files in changeset.
DRILL-1671, DRILL-1653, DRILL-1652: Fixes for flatten bugs

  1. … 13 more files in changeset.
DRILL-1774: Update JSON Reader to do single pass reading and better use Jackson's interning. Also improve projection pushdown support.

    ./DrillBufInputStream.java (+59 -0)
    ./FieldSelection.java (+154 -0)
  1. … 21 more files in changeset.
DRILL-1547: require writers to explicitly check buffer bounds to avoid IndexOutOfBounds errors; make the writer hierarchy stop immediately in case of a write error

  1. … 15 more files in changeset.
DRILL-98: MongoDB storage plugin

This commit disables MongoDB PStore due to changes to the PStore interface.

  1. … 28 more files in changeset.