DRILL-7362: COUNT(*) on JSON with outer list results in JsonParse error

closes #1849

  1. … 3 more files in changeset.
DRILL-7313: Use Hive schema for MaprDB native reader when field was empty

- Added all_text_mode option for hive maprDB Json

- Improved logic to convert Hive's schema into Drill's one

- Added unit tests for schema conversion

  1. … 27 more files in changeset.
DRILL-7011: Support schema in scan framework

* Adds schema support to the row set-based scan framework and to the "V3" text reader based on that framework.

* Adding the schema made clear that passing options as a long list of constructor arguments was not sustainable. Refactored code to use a builder pattern instead.

* Added support for default values in the "null column loader", which required adding a "setValue" method to the column accessors.

* Added unit tests for all new or changed functionality. See TestCsvWithSchema for the overall test of the entire integrated mechanism.

* Added tests for explicit projection with schema

* Better handling of date/time in column accessors

* Converted recent column metadata work from Java 8 date/time to Joda.

* Added more CSV-with-schema unit tests

* Removed the ID fields from "resolved columns", used "instanceof" instead.

* Added wildcard projection with an output schema. Handles both "lenient" and "strict" schemas.

* Tagged projection columns with their output schema, when available.

* Scan projection added modes for wildcard with an output schema. The reader projection added support for merging reader and output schemas.

* Includes refactoring of scan operator tests (the test file grew too large.)

* Renamed some classes to avoid confusing reader schemas with output schemas.

* Added unit tests for the new functionality.

* Added "lenient" wildcard with schema test for CSV

* Added more type conversions: string-to-bit, many-to-string

* Fixed bug in column writer for VarDecimal

* Added missing unit tests, and fixed bugs, in Bit column reader/writer

* Cleaned up a number of unneeded "SuppressWarnings"

closes #1711
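
The move from a long list of constructor arguments to a builder, mentioned in the message above, can be illustrated with a minimal sketch. The class `ScanOptions` and its options (`batchSize`, `allTextMode`, `nullType`) are hypothetical names chosen for this example; they are not taken from Drill's scan framework.

```java
// Hypothetical illustration only: ScanOptions and its fields are invented
// for this sketch and do not mirror Drill's actual scan framework classes.
final class ScanOptions {
    private final int batchSize;
    private final boolean allTextMode;
    private final String nullType;

    // Only the builder can create instances, so every option has a default.
    private ScanOptions(Builder b) {
        this.batchSize = b.batchSize;
        this.allTextMode = b.allTextMode;
        this.nullType = b.nullType;
    }

    int batchSize() { return batchSize; }
    boolean allTextMode() { return allTextMode; }
    String nullType() { return nullType; }

    static Builder builder() { return new Builder(); }

    static final class Builder {
        // Defaults live in one place instead of in overloaded constructors.
        private int batchSize = 4096;
        private boolean allTextMode = false;
        private String nullType = "INT";

        Builder batchSize(int value) {
            if (value <= 0) {
                throw new IllegalArgumentException("batchSize must be positive");
            }
            this.batchSize = value;
            return this;
        }

        Builder allTextMode(boolean value) {
            this.allTextMode = value;
            return this;
        }

        Builder nullType(String value) {
            this.nullType = value;
            return this;
        }

        ScanOptions build() { return new ScanOptions(this); }
    }

    public static void main(String[] args) {
        // Callers name only the options they want to change.
        ScanOptions opts = ScanOptions.builder().allTextMode(true).build();
        System.out.println(opts.batchSize() + " " + opts.allTextMode() + " " + opts.nullType());
    }
}
```

Adding a further option in this style means adding one builder method, while existing call sites stay unchanged.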

  1. … 223 more files in changeset.
DRILL-7060: Support JsonParser Feature 'ALLOW_BACKSLASH_ESCAPING_ANY_CHARACTER' (#1663)

  1. … 7 more files in changeset.
DRILL-6724: Dump operator context to logs when error occurs during query execution

closes #1455

  1. … 102 more files in changeset.
DRILL-6422: Replace guava imports with shaded ones

  1. … 980 more files in changeset.
DRILL-6386: Remove unused imports and star imports.

  1. … 228 more files in changeset.
DRILL-6242 Use java.time.Local{Date|Time|DateTime} for Drill Date, Time, Timestamp types. (#3)

close apache/drill#1247

* DRILL-6242 - Use java.time.Local{Date|Time|DateTime} classes to hold values from corresponding Drill date, time, and timestamp types.

Conflicts:

exec/java-exec/src/main/java/org/apache/drill/exec/vector/complex/fn/ExtendedJsonOutput.java

Fix merge conflicts and check style.
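
As a rough illustration of holding such values in `java.time` local types, the sketch below converts raw numeric values into `LocalDateTime`, `LocalDate` and `LocalTime`. It assumes timestamps and dates arrive as milliseconds since the epoch interpreted in UTC, and times as milliseconds of the day; that representation and the class name `LocalTemporalSketch` are assumptions made for this example, not details confirmed by the commit.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.time.ZoneOffset;

// Hypothetical helper: the input representation (epoch millis in UTC,
// millis of day) is an assumption of this sketch.
final class LocalTemporalSketch {

    // Interpret epoch milliseconds as a zone-less timestamp in UTC.
    static LocalDateTime timestamp(long epochMillis) {
        return LocalDateTime.ofInstant(Instant.ofEpochMilli(epochMillis), ZoneOffset.UTC);
    }

    // Keep only the date part of the same interpretation.
    static LocalDate date(long epochMillis) {
        return timestamp(epochMillis).toLocalDate();
    }

    // Convert milliseconds since midnight into a time of day.
    static LocalTime time(int millisOfDay) {
        return LocalTime.ofNanoOfDay(millisOfDay * 1_000_000L);
    }

    public static void main(String[] args) {
        System.out.println(timestamp(0L));
        System.out.println(date(86_400_000L));
        System.out.println(time(3_600_000));
    }
}
```

Because the conversion always uses UTC rather than the system default zone, the result does not depend on where the code runs.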

  1. … 43 more files in changeset.
DRILL-6320: Fixed license headers.

closes #1207

  1. … 2055 more files in changeset.
DRILL-6118: Handle item star columns during project / filter push down and directory pruning

1. Added DrillFilterItemStarReWriterRule to re-write item star fields to regular field references.

2. Refactored DrillPushProjectIntoScanRule to handle item star fields, factored out helper classes and methods from PrelUtil.class.

3. Fixed issue with dynamic star usage (after Calcite upgrade old usage of star was still present, replaced WILDCARD -> DYNAMIC_STAR for clarity).

4. Added unit tests to check project / filter push down and directory pruning with item star.

  1. … 26 more files in changeset.
DRILL-6049: Misc. hygiene and code cleanup changes

close apache/drill#1085

  1. … 123 more files in changeset.
DRILL-5919: Add non-numeric support for JSON processing

1. Added two session options store.json.reader.non_numeric_numbers and store.json.reader.non_numeric_numbers that allow reading and writing NaN and Infinity as numbers. By default these options are set to true.

2. Extended the signatures of the convert_toJSON and convert_fromJSON functions with a second optional parameter that enables or disables reading and writing NaN and Infinity. By default it is set to true.

3. Added unit tests with NaN and Infinity values for math and aggregate functions.

4. Replaced JsonReader's constructors with builder.

This closes #1026
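
To show what an option of this kind controls, the following sketch parses a numeric token and accepts `NaN`, `Infinity` and `-Infinity` only when a flag is set. The class `NonNumericNumbers`, the method `parseNumber` and the flag `allowNanInf` are hypothetical names for this example; this is not Drill's implementation, and `Double.parseDouble` is more permissive than strict JSON number syntax.

```java
// Hypothetical sketch: names are invented for illustration and the
// parsing is deliberately simplified compared with a real JSON parser.
final class NonNumericNumbers {

    static double parseNumber(String token, boolean allowNanInf) {
        switch (token) {
            case "NaN":
            case "Infinity":
            case "-Infinity":
                if (!allowNanInf) {
                    throw new NumberFormatException(
                        "Non-numeric number not allowed: " + token);
                }
                // Double.parseDouble understands these three spellings.
                return Double.parseDouble(token);
            default:
                return Double.parseDouble(token);
        }
    }

    public static void main(String[] args) {
        System.out.println(parseNumber("NaN", true));
        System.out.println(parseNumber("-Infinity", true));
        System.out.println(parseNumber("1.5", false));
    }
}
```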

  1. … 16 more files in changeset.
DRILL-5864: Selecting a non-existing field from a MapR-DB JSON table fails with NPE.

    ./JsonReaderUtils.java (+94, -0)
  1. … 4 more files in changeset.
DRILL-5355: Misc. code cleanup

closes #784

  1. … 23 more files in changeset.
DRILL-3562: Query fails when using flatten on JSON data where some documents have an empty array

closes #713

  1. … 2 more files in changeset.
DRILL-4653: Malformed JSON should not stop the entire query from progressing

This closes #518

  1. … 8 more files in changeset.
DRILL-4479: For empty fields under all_text_mode enabled (a) use varchar for the default columns and (b) ensure we create fields corresponding to all columns.

close apache/drill#420

  1. … 3 more files in changeset.
DRILL-4184: Support variable length decimal fields in parquet

  1. … 68 more files in changeset.
DRILL-4382: Remove dependency on drill-logical from vector package

  1. … 80 more files in changeset.
DRILL-2288: Fix ScanBatch violation of IterOutcome protocol and downstream chain of bugs.

Increments:

2288: Pt. 1 Core: Added unit test. [Drill2288GetColumnsMetadataWhenNoRowsTest, empty.json]

2288: Pt. 1 Core: Changed HBase test table #1's # of regions from 1 to 2. [HBaseTestsSuite]

Also added TODO(DRILL-3954) comment about # of regions.

2288: Pt. 2 Core: Documented IterOutcome much more clearly. [RecordBatch]

Also edited some related Javadoc.

2288: Pt. 2 Hyg.: Edited doc., added @Override, etc. [AbstractRecordBatch, RecordBatch]

Purged unused SetupOutcome.

Added @Override.

Edited comments.

Fix some comments to doc. comments.

2288: Pt. 3 Core&Hyg.: Added validation of IterOutcome sequence. [IteratorValidatorBatchIterator]

Also:

Renamed internal members for clarity.

Added comments.

2288: Pt. 4 Core: Fixed a NONE -> OK_NEW_SCHEMA in ScanBatch.next(). [ScanBatch]

(With nearby comments.)

2288: Pt. 4 Hyg.: Edited comments, reordered, whitespace. [ScanBatch]

Reordered

Added comments.

Aligned.

2288: Pt. 4 Core+: Fixed UnionAllRecordBatch to receive IterOutcome sequence right. (3659) [UnionAllRecordBatch]

2288: Pt. 5 Core: Fixed ScanBatch.Mutator.isNewSchema() to stop spurious "new schema" reports (fix short-circuit OR, to call resetting method right). [ScanBatch]

2288: Pt. 5 Hyg.: Renamed, edited comments, reordered. [ScanBatch, SchemaChangeCallBack, AbstractSingleRecordBatch]

Renamed getSchemaChange -> getSchemaChangedAndReset.

Renamed schemaChange -> schemaChanged.

Added doc. comments.

Aligned.

2288: Pt. 6 Core: Avoided dummy Null.IntVec. column in JsonReader when not needed (MapWriter.isEmptyMap()). [JsonReader, 3 vector files]

2288: Pt. 6 Hyg.: Edited comments, message. Fixed message formatting. [RecordReader, JSONFormatPlugin, JSONRecordReader, AbstractMapVector, JsonReader]

Fixed message formatting.

Edited comments.

Edited message.

Fixed spurious line break.

2288: Pt. 7 Core: Added column families in HBaseRecordReader* to avoid dummy Null.IntVec. clash. [HBaseRecordReader]

2288: Pt. 8 Core.1: Cleared recordCount in OrderedPartitionRecordBatch.innerNext(). [OrderedPartitionRecordBatch]

2288: Pt. 8 Core.2: Cleared recordCount in ProjectRecordBatch.innerNext. [ProjectRecordBatch]

2288: Pt. 8 Core.3: Cleared recordCount in TopNBatch.innerNext. [TopNBatch]

2288: Pt. 9 Core: Had UnorderedReceiverBatch reset RecordBatchLoader's record count. [UnorderedReceiverBatch, RecordBatchLoader]

2288: Pt. 9 Hyg.: Added comments. [RecordBatchLoader]

2288: Pt. 10 Core: Worked around mismatched map child vectors in MapVector.getObject(). [MapVector]

2288: Pt. 11 Core: Added OK_NEW_SCHEMA schema comparison for HashAgg. [HashAggTemplate]

2288: Pt. 12 Core: Fixed memory leak in BaseTestQuery's printing.

Fixed bad skipping of RecordBatchLoader.clear(...) and QueryDataBatch.load(...) for zero-row batches in printResult(...).

Also, dropped suppression of call to VectorUtil.showVectorAccessibleContent(...) (so zero-row batches are as visible as others).

2288: Pt. 13 Core: Fixed test that used unhandled periods in column alias identifiers.

2288: Misc.: Added # of rows to showVectorAccessibleContent's output. [VectorUtil]

2288: Misc.: Added simple/partial toString() [VectorContainer, AbstractRecordReader, JSONRecordReader, BaseValueVector, FieldSelection, AbstractBaseWriter]

2288: Misc. Hyg.: Added doc. comments to VectorContainer. [VectorContainer]

2288: Misc. Hyg.: Edited comment. [DrillStringUtils]

2288: Misc. Hyg.: Clarified message for unhandled identifier containing period.

2288: Pt. 3 Core&Hyg. Upd.: Added schema comparison result to logging. [IteratorValidatorBatchIterator]

2288: Pt. 7 Core Upd.: Handled HBase columns too re NullableIntVectors. [HBaseRecordReader, TestTableGenerator, TestHBaseFilterPushDown]

Created map-child vectors for requested columns.

Added unit test method testDummyColumnsAreAvoided, adding new row to test table, updated some row counts.

2288: Pt. 7 Hyg. Upd.: Edited comment. [HBaseRecordReader]

2288: Pt. 11 Core Upd.: REVERTED all of bad OK_NEW_SCHEMA schema comparison for HashAgg. [HashAggTemplate]

This reverts commit 0939660f4620c03da97f4e1bf25a27514e6d0b81.

2288: Pt. 6 Core Upd.: Added isEmptyMap override in new (just-rebased-in) PromotableWriter. [PromotableWriter]

Adjusted definition and default implementation of isEmptyMap (to handle MongoDB storage plugin's use of JsonReader).

2288: Pt. 6 Hyg. Upd.: Purged old atLeastOneWrite flag. [JsonReader]

2288: Pt. 14: Disabled newly dying test testNestedFlatten().
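
The Pt. 5 Core entry above mentions a short-circuit OR that kept a resetting method from being called. The general pitfall can be sketched as follows. `ChangeFlag`, `getAndReset` and the two `isNewSchema...` methods are hypothetical stand-ins written for this example; they are not the actual `ScanBatch.Mutator` or `SchemaChangeCallBack` code.

```java
// Hypothetical stand-in for a callback that records a change and
// clears the record when it is read.
final class ChangeFlag {
    private boolean changed;

    void mark() { changed = true; }

    boolean getAndReset() {
        boolean was = changed;
        changed = false;
        return was;
    }
}

final class ShortCircuitDemo {

    // Buggy shape: when localChange is true, || never evaluates
    // getAndReset(), so the flag stays set and is reported again later.
    static boolean isNewSchemaBuggy(boolean localChange, ChangeFlag flag) {
        return localChange || flag.getAndReset();
    }

    // Fixed shape: call the resetting method unconditionally, then combine.
    static boolean isNewSchemaFixed(boolean localChange, ChangeFlag flag) {
        boolean callbackChange = flag.getAndReset();
        return localChange || callbackChange;
    }

    public static void main(String[] args) {
        ChangeFlag a = new ChangeFlag();
        a.mark();
        isNewSchemaBuggy(true, a);
        // The stale flag leaks into the next, otherwise unchanged, call.
        System.out.println("buggy second call: " + isNewSchemaBuggy(false, a));

        ChangeFlag b = new ChangeFlag();
        b.mark();
        isNewSchemaFixed(true, b);
        System.out.println("fixed second call: " + isNewSchemaFixed(false, b));
    }
}
```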

  1. … 38 more files in changeset.
DRILL-3229: Miscellaneous Union-type fixes

closes #207

closes #180

  1. … 38 more files in changeset.
DRILL-3229: Implement Union type vector

  1. … 53 more files in changeset.
DRILL-3773: Fix Mongo FieldSelection

Mongo plugin was previously rewriting a complex (multi-level) column reference as a simple selection of the top level field.

This changeset does not change this behavior in terms of the filter sent to Mongo, but it adds the original selected column to the list that will be read in by the JSON reader once that data is returned from Mongo.

This means that we will still request more data from Mongo than necessary (as we did previously), but we now leverage the existing functionality in the JSON reader to grab only the sub-selection actually requested in the query. This allows difficult schema changes to be avoided by projecting only columns without schema changes.

This also fixes bugs in FieldSelection that caused wrong results when selecting a nested column and its parent, and adds unit tests for those cases.
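
One way to picture the nested-column-and-parent case is a small tree of selected paths in which selecting a parent subsumes any of its children. The sketch below is a hypothetical illustration; `PathSelection`, `add` and `includes` are invented names, and the logic is not taken from Drill's `FieldSelection` class.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of merging dotted column paths such as "a" and "a.b".
final class PathSelection {
    private final Map<String, PathSelection> children = new HashMap<>();
    private boolean selectAll;

    // Register a selected path. A fully selected ancestor wins over any
    // narrower selection beneath it, regardless of insertion order.
    void add(String dottedPath) {
        PathSelection node = this;
        for (String part : dottedPath.split("\\.")) {
            if (node.selectAll) {
                return; // an ancestor already covers this path
            }
            node = node.children.computeIfAbsent(part, k -> new PathSelection());
        }
        node.selectAll = true;
        node.children.clear(); // narrower selections are now redundant
    }

    // True if the path must be read: it is selected, lies under a selected
    // ancestor, or is an ancestor of a selected path.
    boolean includes(String dottedPath) {
        PathSelection node = this;
        for (String part : dottedPath.split("\\.")) {
            if (node.selectAll) {
                return true;
            }
            node = node.children.get(part);
            if (node == null) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        PathSelection sel = new PathSelection();
        sel.add("a.b");
        sel.add("a");
        System.out.println(sel.includes("a.c"));
        System.out.println(sel.includes("z"));
    }
}
```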

  1. … 7 more files in changeset.
DRILL-1942-hygiene

- Formatting

- @Overrides

- finals

- some AutoCloseable additions

- new isCancelled() abstract method on FragmentManager, implemented on subclasses

Added missing new abstract method isCancelled()

Close apache/drill#120

  1. … 23 more files in changeset.
DRILL-2879: Part 2 - Enhancing extended json support for date in millis and binary with type info

Addressing review comments

Updated unit test to remove timezone that was being pulled from the local system (and thus failed to match the baseline if run from a different timezone)

  1. … 3 more files in changeset.
DRILL-2879: Enhancing extended json support for date in millis and binary with type info

Ignore project push down Mongo test until test completes correctly on Linux

  1. … 4 more files in changeset.
DRILL-3476: Merge paths in FieldSelection

Conflicts:

exec/java-exec/src/test/java/org/apache/drill/exec/store/json/TestJsonRecordReader.java

  1. … 2 more files in changeset.
DRILL-3319: Replaced UserException#build() method with #build(Logger) method to log from the correct class

+ Fixed docs in UserException

+ Created loggers, and changed logger visibility to private

  1. … 37 more files in changeset.
DRILL-3019: In JsonReader, update atLeastOneWrite to true if writeListDataIfTyped and writeMapDataIfTyped write a value

  1. … 1 more file in changeset.
DRILL-2350: Improve exception handling and error messages in JSON reader.

  1. … 7 more files in changeset.