DRILL-7502: Invalid codegen for typeof() with UNION

Also fixes DRILL-6362: typeof() reports NULL for primitive columns with a NULL value.

typeof() is meant to return "NULL" if a UNION has a NULL value, but the column type when known, such as for non-UNION columns.
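
The rule stated above can be sketched as follows. This is only an illustration of the intended behavior, not Drill's generated code: the class, the method, and its parameters are hypothetical, and it assumes that for a non-NULL UNION value the "known" type is the type of the current value.

```java
public class TypeOfSketch {
  // Hypothetical helper, not part of Drill: models only the rule in the commit message.
  //   isUnion   - whether the column is a UNION
  //   isNull    - whether the current value is NULL
  //   knownType - the declared column type or, for a UNION, the type of the
  //               current value (an assumption of this sketch)
  static String typeOf(boolean isUnion, boolean isNull, String knownType) {
    if (isUnion && isNull) {
      // Only a UNION has no usable type when the value is NULL.
      return "NULL";
    }
    // A primitive column keeps its type even when the value is NULL.
    return knownType;
  }

  public static void main(String[] args) {
    System.out.println(typeOf(true, true, null));       // NULL
    System.out.println(typeOf(false, true, "INT"));     // INT
    System.out.println(typeOf(true, false, "VARCHAR")); // VARCHAR
  }
}
```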

Also fixes DRILL-7499: sqltypeof() function with an array returns "ARRAY", not type. This was due to treating REPEATED like LIST.

Handling of the Union vector in code gen is problematic, with about three special cases. Existing code handled two of the cases. This change handles the third case.

Figuring out the change required poking around quite a bit of unclear code. Added comments and restructured that code to make it a bit clearer.

The fix modified code gen for the Union Holder. It can now "go back in time" to add the union reader at the point we need it.

closes #1945

  1. … 54 more files in changeset.
DRILL-6832: Removes the old "unmanaged" external sort

When the "managed" external sort was implemented a couple of years back, we retained the original "unmanaged" version out of an abundance of caution. The new version is now battle tested and it is time to retire the original one.

closes #1929

    ./SortTestUtilities.java (-0, +133)
    ./TestCopier.java (-0, +377)
    ./TestExternalSortExec.java (-0, +191)
    ./TestExternalSortInternals.java (-0, +726)
    ./TestLenientAllocation.java (-0, +196)
    ./TestShortArrays.java (-0, +116)
    ./TestSortEmitOutcome.java (-0, +727)
    ./TestSortImpl.java (-0, +626)
    ./TestSorter.java (-0, +658)
    ./managed/SortTestUtilities.java (-133, +0)
    ./managed/TestExternalSortExec.java (-191, +0)
  1. … 50 more files in changeset.
DRILL-7350: Move RowSet related classes from test folder

  1. … 286 more files in changeset.
DRILL-7310: Move schema-related classes from exec module to be able to use them in metastore module

closes #1816

  1. … 98 more files in changeset.
DRILL-7011: Support schema in scan framework

* Adds schema support to the row set-based scan framework and to the "V3" text reader based on that framework.

* Adding the schema made clear that passing options as a long list of constructor arguments was not sustainable. Refactored code to use a builder pattern instead.

* Added support for default values in the "null column loader", which required adding a "setValue" method to the column accessors.

* Added unit tests for all new or changed functionality. See TestCsvWithSchema for the overall test of the entire integrated mechanism.

* Added tests for explicit projection with schema

* Better handling of date/time in column accessors

* Converted recent column metadata work from Java 8 date/time to Joda.

* Added more CSV-with-schema unit tests

* Removed the ID fields from "resolved columns", used "instanceof" instead.

* Added wildcard projection with an output schema. Handles both "lenient" and "strict" schemas.

* Tagged projection columns with their output schema, when available.

* Scan projection added modes for wildcard with an output schema. The reader projection added support for merging reader and output schemas.

* Includes refactoring of scan operator tests (the test file grew too large).

* Renamed some classes to avoid confusing reader schemas with output schemas.

* Added unit tests for the new functionality.

* Added "lenient" wildcard with schema test for CSV

* Added more type conversions: string-to-bit, many-to-string

* Fixed bug in column writer for VarDecimal

* Added missing unit tests, and fixed bugs, in Bit column reader/writer

* Cleaned up a number of unneeded "SuppressWarnings"
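
The move from long constructor argument lists to a builder, mentioned in the second bullet above, might look roughly like this. The class and option names here are hypothetical and chosen only for illustration; they are not the classes introduced by this change.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical options holder: each setting is set by name instead of by
// position in a long constructor argument list.
public class ScanOptionsSketch {
  private final List<String> projection;
  private final String outputSchema;   // null when no schema is provided
  private final boolean strictSchema;

  private ScanOptionsSketch(Builder b) {
    this.projection = Collections.unmodifiableList(new ArrayList<>(b.projection));
    this.outputSchema = b.outputSchema;
    this.strictSchema = b.strictSchema;
  }

  static Builder builder() { return new Builder(); }

  List<String> projection() { return projection; }
  String outputSchema() { return outputSchema; }
  boolean strictSchema() { return strictSchema; }

  static final class Builder {
    // Defaults: wildcard projection, no output schema, lenient mode.
    private List<String> projection = Collections.singletonList("*");
    private String outputSchema;
    private boolean strictSchema;

    Builder projection(List<String> columns) { this.projection = columns; return this; }
    Builder outputSchema(String schema) { this.outputSchema = schema; return this; }
    Builder strictSchema(boolean strict) { this.strictSchema = strict; return this; }

    ScanOptionsSketch build() { return new ScanOptionsSketch(this); }
  }
}
```

A caller sets only what it needs, for example `ScanOptionsSketch.builder().outputSchema("a INT").strictSchema(true).build()`, so adding a new option later does not change any existing call site.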

closes #1711

  1. … 219 more files in changeset.
DRILL-7007: Use verify method in row set tests

Many of the early RowSet-based tests used the pattern:

    new RowSetComparison(expected)
        .verifyAndClearAll(result);

Revise this to use the simplified form:

    RowSetUtilities.verify(expected, result);

The original form is retained when tests use additional functionality, such as the ability to perform multiple verifications on the same expected batch.
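
The relationship between the two forms can be sketched with stand-in classes. Everything below is a simplified model written for this note: FakeRowSet and Comparison are hypothetical substitutes for the real row set classes, and the sketch assumes that the one-line form simply wraps the longer one.

```java
import java.util.Arrays;
import java.util.List;

public class VerifySketch {
  // Hypothetical stand-in for a row set: a list of rows plus a "cleared" flag.
  static final class FakeRowSet {
    final List<String> rows;
    boolean cleared;
    FakeRowSet(String... rows) { this.rows = Arrays.asList(rows); }
    void clear() { cleared = true; }
  }

  // Hypothetical stand-in for the comparison helper.
  static final class Comparison {
    private final FakeRowSet expected;
    Comparison(FakeRowSet expected) { this.expected = expected; }

    // Compares without releasing anything, so the same expected batch
    // can be checked against several results.
    void verify(FakeRowSet actual) {
      if (!expected.rows.equals(actual.rows)) {
        throw new AssertionError("Expected " + expected.rows + " but got " + actual.rows);
      }
    }

    // Compares, then releases both batches even if the comparison fails.
    void verifyAndClearAll(FakeRowSet actual) {
      try {
        verify(actual);
      } finally {
        expected.clear();
        actual.clear();
      }
    }
  }

  // The simplified one-line form, modeled here as a thin wrapper.
  static void verify(FakeRowSet expected, FakeRowSet actual) {
    new Comparison(expected).verifyAndClearAll(actual);
  }
}
```

In this model the longer form is still needed when one expected batch is verified several times, because the wrapper releases both batches after a single comparison.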

closes #1624

  1. … 8 more files in changeset.
DRILL-6901: Move schema builder to src/main

Moves the SchemaBuilder class out of the src/test namespace into the src/main namespace, specifically into the existing record.metadata package.

Many files changed in this move. Corrected two minor issues: import of the wrong Arrays class and unnecessary annotations.

  1. … 84 more files in changeset.
DRILL-6422: Replace guava imports with shaded ones

  1. … 980 more files in changeset.
DRILL-6498: Support for EMIT outcome in ExternalSortBatch

* Updated TestTopNEmitOutcome to use RowSetComparison for comparing expected and actual output batches produced

closes #1323

    ./managed/TestSortEmitOutcome.java (-0, +728)
  1. … 7 more files in changeset.
DRILL-6496: Added print methods for debugging tests, and fixed missing log statement in VectorUtils.

closes #1336

  1. … 33 more files in changeset.
DRILL-6438: Remove excess logging from the tests. - Removed usages of System.out and System.err from the tests and replaced them with loggers

closes #1284

  1. … 90 more files in changeset.
DRILL-6386: Remove unused imports and star imports.

    ./managed/TestExternalSortInternals.java (-1, +0)
  1. … 229 more files in changeset.
DRILL-6027: - Added fallback option for HashJoin. - No copy of incoming for single partition, and avoid HT resize. - Fix memory leak when cancelling while spill file is read - get correct schema when probe side is empty - Re-create the HashJoinProbe

    ./managed/TestExternalSortInternals.java (-5, +9)
  1. … 41 more files in changeset.
DRILL-6320: Fixed license headers.

closes #1207

  1. … 2066 more files in changeset.
DRILL-6094: Decimal data type enhancements

Add ExprVisitors for VARDECIMAL

Modify writers/readers to support VARDECIMAL

- Added usage of VarDecimal for parquet, hive, maprdb, jdbc;

- Added options to store decimals as int32 and int64 or fixed_len_byte_array or binary;

Add UDFs for VARDECIMAL data type

- modify type inference rules

- remove UDFs for obsolete DECIMAL types

Enable DECIMAL data type by default

Add unit tests for DECIMAL data type

Fix mapping for NLJ when literal with non-primitive type is used in join conditions

Refresh protobuf C++ source files

Changes in C++ files

Add support for decimal logical type in Avro.

Add support for date, time and timestamp logical types.

Update Avro version to 1.8.2.
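
The storage options listed above (int32, int64, fixed_len_byte_array, binary) follow from how many decimal digits each physical type can hold. A minimal sketch, assuming the usual Parquet limits of precision 9 for int32 and 18 for int64; the helper names are hypothetical and the commit message does not say how Drill's options choose among the types:

```java
import java.math.BigDecimal;

public class DecimalStorageSketch {
  // Smallest integer type able to hold every unscaled value of the given precision.
  // Any 9-digit value fits in a signed 32-bit int (max 2147483647) and
  // any 18-digit value fits in a signed 64-bit long (max 9223372036854775807).
  static String physicalType(int precision) {
    if (precision < 1) {
      throw new IllegalArgumentException("precision must be positive: " + precision);
    }
    if (precision <= 9) {
      return "int32";
    }
    if (precision <= 18) {
      return "int64";
    }
    // Wider decimals need a byte array (fixed_len_byte_array or binary).
    return "fixed_len_byte_array";
  }

  // The value actually stored is the unscaled integer; the scale lives in metadata.
  static long unscaled(BigDecimal value) {
    return value.unscaledValue().longValueExact();
  }

  public static void main(String[] args) {
    BigDecimal price = new BigDecimal("123.45");   // precision 5, scale 2
    System.out.println(physicalType(price.precision()) + " " + unscaled(price)); // int32 12345
  }
}
```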

  1. … 200 more files in changeset.
DRILL-6162: Enhance record batch sizer to retain nesting information. Refactor record batch sizer and add unit tests for sizing and vector allocation.

  1. … 5 more files in changeset.
DRILL-6210: Enhanced test schema utilities

closes #1150

  1. … 50 more files in changeset.
DRILL-6180: Use System Option "output_batch_size" for External Sort

closes #1129

    ./managed/TestExternalSortInternals.java (-43, +45)
  1. … 3 more files in changeset.
DRILL-6027: - Added memory calculator - Added unit tests and docs. - Fixed IOB caused by output vector allocation. - Don't double count records that were spilled in HashJoin

  1. … 54 more files in changeset.
DRILL-6148: TestSortSpillWithException is sometimes failing.

closes #1120

DRILL-6138: Move RecordBatchSizer to org.apache.drill.exec.record package

This closes #1115

  1. … 10 more files in changeset.
DRILL-6114: Metadata revisions

Support for union vectors, list vectors, repeated list vectors. Refactored metadata classes.

closes #1112

  1. … 72 more files in changeset.
DRILL-5730: Mock testing improvements and interface improvements

closes #1045

  1. … 223 more files in changeset.
DRILL-6080: Sort incorrectly limits batch size to 65535 records

closes #1090

* Sort incorrectly limits batch size to 65535 records rather than 65536.

* This PR also includes a few code cleanup items.

* Fix for overflow in offset vector in row set writer

* Performance tool update

* Replace "unsafe" methods with "set" methods

* Also fixes an indexing issue with nullable writers

* Removed debug & timing code

* Increase strictness for batch size
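
The off-by-one in the first bullet can be illustrated as follows. This sketch assumes the limit comes from a two-byte record index, which the commit message does not state; the class and method names are hypothetical.

```java
public class BatchLimitSketch {
  // Assumption of this sketch: records are addressed by an unsigned 16-bit index.
  static final int INDEX_BITS = 16;
  static final int MAX_INDEX = (1 << INDEX_BITS) - 1;   // 65535, the largest index
  static final int MAX_RECORDS = 1 << INDEX_BITS;       // 65536, the number of indexes

  // Buggy check: confuses the largest index with the record count.
  static boolean fitsBuggy(int recordCount) {
    return recordCount <= MAX_INDEX;
  }

  // Corrected check: indexes 0..65535 address 65536 records.
  static boolean fitsFixed(int recordCount) {
    return recordCount <= MAX_RECORDS;
  }

  // Index of the last record, stored in Java's unsigned 16-bit type (char).
  static int lastIndex(int recordCount) {
    return (char) (recordCount - 1);
  }

  public static void main(String[] args) {
    System.out.println(fitsBuggy(65536));   // false
    System.out.println(fitsFixed(65536));   // true
    System.out.println(lastIndex(65536));   // 65535
  }
}
```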

  1. … 10 more files in changeset.
DRILL-6049: Misc. hygiene and code cleanup changes

close apache/drill#1085

    ./managed/SortTestUtilities.java (-21, +25)
    ./managed/TestExternalSortInternals.java (-20, +20)
  1. … 117 more files in changeset.
DRILL-6030: Managed sort should minimize number of batches in a k-way merge

This closes #1075

    ./managed/TestExternalSortInternals.java (-7, +5)
  1. … 3 more files in changeset.
DRILL-5989: Categorize some tests to speed up smoke tests. Made Travis run tests.

closes #1053

  1. … 31 more files in changeset.
DRILL-5783, DRILL-5841, DRILL-5894: Rationalize test temp directories

This change includes:

DRILL-5783:

- A unit test is created for the priority queue in the TopN operator.

- The code generation classes passed around a completely unused function registry reference in some places so it is removed.

- The priority queue had unused parameters for some of its methods, so they are removed.

DRILL-5841:

- Created standardized temp directory classes DirTestWatcher, SubDirTestWatcher, and BaseDirTestWatcher, and updated all unit tests to use them.

DRILL-5894:

- Removed the dfs_test storage plugin for tests and replaced it with the already existing dfs storage plugin.

Misc:

- General code cleanup.

- Removed unnecessary use of String.format in the tests.

This closes #984

  1. … 363 more files in changeset.
DRILL-5842: Refactor fragment, operator contexts

This closes #978

  1. … 35 more files in changeset.
DRILL-5832: Change OperatorFixture to use system option manager

- Rename FixtureBuilder to ClusterFixtureBuilder

- Provide alternative way to reset system/session options

- Fix for DRILL-5833: random failure in TestParquetWriter

- Provide strict, but clear, errors for missing options

closes #970

  1. … 48 more files in changeset.