DRILL-7337: Add vararg UDFs support

    • -0
    • +325
    ./impl/TestVarArgFunctions.java
    • -0
    • +126
    ./impl/testing/CountArgumentsAggFunctions.java
    • -0
    • +90
    ./impl/testing/CountArgumentsFunctions.java
    • -0
    • +65
    ./impl/testing/InvalidVarargFunctions.java
    • -0
    • +42
    ./impl/testing/VarArgAddFunction.java
    • -0
    • +275
    ./impl/testing/VarCharConcatFunctions.java
  1. … 32 more files in changeset.
DRILL-7315: Revise precision and scale order in the method arguments

  1. … 27 more files in changeset.
DRILL-7310: Move schema-related classes from exec module to be able to use them in metastore module

closes #1816

  1. … 102 more files in changeset.
DRILL-7297: Query hangs in planning stage when Error is thrown

close apache/drill#1811

    • -0
    • +42
    ./impl/testing/CustomErrorFunction.java
  1. … 2 more files in changeset.
DRILL-7253: Read Hive struct w/o nulls

  1. … 17 more files in changeset.
DRILL-6951: Merge row set based mock data source

The mock data source is used in several tests to generate a large volume

of sample data, such as when testing spilling. The mock data source also

lets us try new plugin features in a very simple context. During the

development of the row set framework, the mock data source was converted

to use the new framework to verify functionality. This commit upgrades

the mock data source with that work.

The work changes none of the functionality. It does, however, improve

memory usage. Batches are limited, by default, to 10 MB in size. The row

set framework minimizes internal fragmentation in the largest vector.

(Previously, internal fragmentation averaged 25% but could be as high as

50%.)

As it turns out, the hash aggregate tests depended on the internal

fragmentation: without it, the hash agg no longer spilled for the same

row count. Adjusted the generated row counts to recreate a data volume

that caused spilling.

One test in particular always failed due to assertions in the hash agg

code. These seem to be true bugs and are described in DRILL-7301. After

multiple failed attempts to get the test to work, it was disabled until

DRILL-7301 is fixed.

Added a new unit test to sanity check the mock data source. (No test

already existed for this functionality except as verified via other unit

tests.)

    • -5
    • +4
    ./interp/ExpressionInterpreterTest.java
  1. … 21 more files in changeset.
DRILL-7237: Fix single_value aggregate function for variable length types

- Add implementations of single_value for complex data types

closes #1782

    • -5
    • +110
    ./impl/TestAggregateFunctions.java
  1. … 12 more files in changeset.
DRILL-4782 / DRILL-7139: Fix DATE_ADD and TO_TIME functions

- cast function for the day interval changed to round milliseconds to complete days

- ToDateTypeFunctions#toTime now returning milliseconds of day

- updated how DayInterval subtracts and adds, to follow the cast function logic

UT core updates:

- added vectorValue function to the queryBuilder to simplify retrieving value of the vector

- refactored singleton query result functions at queryBuilder
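
The first two items above come down to simple millisecond arithmetic. The following is a minimal sketch of that arithmetic, not the code from this commit: the class and method names are hypothetical, timestamps are assumed to be UTC epoch milliseconds, and "complete days" is assumed to mean dropping the partial day (truncation toward zero).

```java
final class DayMillisSketch {
    // Number of milliseconds in one 24-hour day.
    static final long MILLIS_PER_DAY = 24L * 60 * 60 * 1000;

    // Milliseconds elapsed since midnight for an epoch-millisecond value.
    // floorMod keeps the result in [0, MILLIS_PER_DAY) for negative inputs too.
    static int millisOfDay(long epochMillis) {
        return (int) Math.floorMod(epochMillis, MILLIS_PER_DAY);
    }

    // Keeps only the whole days of an interval, expressed in milliseconds.
    // Assumption: the partial day is truncated toward zero.
    static long wholeDaysInMillis(long intervalMillis) {
        return (intervalMillis / MILLIS_PER_DAY) * MILLIS_PER_DAY;
    }
}
```

Under these assumptions, an interval of one day plus five milliseconds is reduced to exactly one day, and a timestamp one millisecond before the epoch maps to the last millisecond of the day.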

    • -0
    • +12
    ./impl/testing/TestDateConversions.java
  1. … 5 more files in changeset.
DRILL-7143: Support default value for empty columns

Modifies the prior work to add default values for columns. The prior work added defaults

when the entire column is missing from a reader (the old Nullable Int column). The Row

Set mechanism now will also "fill empty" slots with the default value.

Added default support for the column writers. The writers automatically obtain the

default value from the column schema. The default can also be set explicitly on

the column writer.

Updated the null column mechanism to use this feature rather than the ad-hoc

implementation in the prior commit.

Semantics changed a bit. Only Required columns take a default. The default value

is ignored for nullable columns since nullable columns already have a file default: NULL.

Other changes:

* Updated the CSV-with-schema tests to illustrate the new behavior.

* Made multiple fixes for Boolean and Decimal columns and added unit tests.

* Upgraded FreeMarker to version 2.3.28 to allow use of the continue statement.

* Reimplemented the Bit column reader and writer to use the BitVector directly since this vector is rather special.

* Added get/set Boolean methods for column accessors

* Moved the BooleanType class to the common package

* Added more CSV unit tests to explore decimal types, booleans, and defaults

* Add special handling for blank fields in from-string conversions

* Added options to the conversion factory to specify blank-handling behavior.

CSV uses a mapping of blanks to null (nullable) or default value (non-nullable)
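
The blank-handling rule described above (blank maps to NULL for nullable columns, and to the default value for non-nullable ones) can be sketched as follows. This is an illustration only, not the conversion factory from this commit: the class name, method name, and the choice of an integer column are all hypothetical.

```java
final class BlankFieldSketch {
    // Converts one CSV text field to an Integer column value.
    // A blank field becomes null for a nullable column and the
    // supplied default for a non-nullable (required) column.
    static Integer convert(String field, boolean nullable, int defaultValue) {
        if (field == null || field.trim().isEmpty()) {
            if (nullable) {
                return null;
            }
            return defaultValue;
        }
        return Integer.valueOf(field.trim());
    }
}
```

With this sketch, `convert("", true, 7)` yields null, `convert("", false, 7)` yields the default 7, and a non-blank field is parsed regardless of nullability.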

closes #1726

    • -1
    • +0
    ./impl/TestByteComparisonFunctions.java
  1. … 72 more files in changeset.
DRILL-7011: Support schema in scan framework

* Adds schema support to the row set-based scan framework and to the "V3" text reader based on that framework.

* Adding the schema made clear that passing options as a long list of constructor arguments was not sustainable. Refactored code to use a builder pattern instead.

* Added support for default values in the "null column loader", which required adding a "setValue" method to the column accessors.

* Added unit tests for all new or changed functionality. See TestCsvWithSchema for the overall test of the entire integrated mechanism.

* Added tests for explicit projection with schema

* Better handling of date/time in column accessors

* Converted recent column metadata work from Java 8 date/time to Joda.

* Added more CSV-with-schema unit tests

* Removed the ID fields from "resolved columns", used "instanceof" instead.

* Added wildcard projection with an output schema. Handles both "lenient" and "strict" schemas.

* Tagged projection columns with their output schema, when available.

* Scan projection added modes for wildcard with an output schema. The reader projection added support for merging reader and output schemas.

* Includes refactoring of scan operator tests (the test file grew too large.)

* Renamed some classes to avoid confusing reader schemas with output schemas.

* Added unit tests for the new functionality.

* Added "lenient" wildcard with schema test for CSV

* Added more type conversions: string-to-bit, many-to-string

* Fixed bug in column writer for VarDecimal

* Added missing unit tests, and fixed bugs, in Bit column reader/writer

* Cleaned up a number of unneeded "SuppressWarnings"
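
The move from long constructor argument lists to a builder, mentioned in the second item above, follows a familiar shape. The sketch below shows the general pattern only. The class and its options (`useSchema`, `strictSchema`, `batchRowLimit`) are hypothetical and are not taken from the scan framework.

```java
final class ScanOptionsSketch {
    final boolean useSchema;
    final boolean strictSchema;
    final int batchRowLimit;

    // Only the builder creates instances, so callers never face a long
    // positional argument list.
    private ScanOptionsSketch(Builder b) {
        this.useSchema = b.useSchema;
        this.strictSchema = b.strictSchema;
        this.batchRowLimit = b.batchRowLimit;
    }

    static Builder builder() {
        return new Builder();
    }

    static final class Builder {
        // Defaults apply to any option the caller does not set.
        private boolean useSchema = false;
        private boolean strictSchema = false;
        private int batchRowLimit = 4096; // hypothetical default

        Builder useSchema(boolean value) { this.useSchema = value; return this; }
        Builder strictSchema(boolean value) { this.strictSchema = value; return this; }
        Builder batchRowLimit(int value) { this.batchRowLimit = value; return this; }

        ScanOptionsSketch build() {
            return new ScanOptionsSketch(this);
        }
    }
}
```

A caller sets only what differs from the defaults, for example `ScanOptionsSketch.builder().strictSchema(true).build()`, and adding a new option later does not break existing call sites.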

closes #1711

    • -3
    • +0
    ./interp/ExpressionInterpreterTest.java
  1. … 224 more files in changeset.
DRILL-4858: REPEATED_COUNT on an array of maps and an array of arrays is not implemented

- Implemented 'repeated_count' function for repeated MAP and repeated LIST;

- Updated RepeatedListReader and RepeatedMapReader implementations to return correct value from size() method

- Moved repeated_count to freemarker template and added support for more repeated types for the function

closes #1641

    • -26
    • +299
    ./impl/TestNewSimpleRepeatedFunctions.java
  1. … 8 more files in changeset.
DRILL-6967: Fix TIMESTAMPDIFF function for QUARTER qualifier

closes #1609
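
For reference, one common definition of a quarter difference is the number of complete months between two timestamps divided by three. The sketch below uses `java.time` to show that definition. It is not the code from this commit, the class and method names are hypothetical, and the fixed function may define edge cases differently.

```java
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;

final class QuarterDiffSketch {
    // Complete quarters between two timestamps: complete months divided by 3.
    // Integer division truncates toward zero, so partial quarters are dropped.
    static long quartersBetween(LocalDateTime start, LocalDateTime end) {
        return ChronoUnit.MONTHS.between(start, end) / 3;
    }
}
```

Under this definition, 2019-01-01 to 2019-07-01 is two quarters, while 2019-01-15 to 2019-04-14 is zero because the third month is not yet complete.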

    • -37
    • +43
    ./impl/TestTimestampAddDiffFunctions.java
  1. … 1 more file in changeset.
DRILL-6959: Fix loss of precision when casting time and timestamp literals in filter condition closes #1607

    • -48
    • +101
    ./impl/TestCastFunctions.java
  1. … 1 more file in changeset.
DRILL-6962: Function coalesce returns an Error when none of the columns in coalesce exist in a parquet file

- Updated UntypedNullVector to hold value count when vector is allocated and transferred to another one;

- Updated RecordBatchLoader and DrillCursor to handle case when only UntypedNull values are present in RecordBatch (special case when data buffer is null but actual values are present);

- Added functions to cast UntypedNull value to other types for use in UDFs;

- Moved UntypedReader, UntypedHolderReaderImpl and UntypedReaderImpl from org.apache.drill.exec.vector.complex.impl to org.apache.drill.exec.vector package.

closes #1614

  1. … 16 more files in changeset.
DRILL-3610: Add TIMESTAMPADD and TIMESTAMPDIFF functions

closes #1528

    • -0
    • +202
    ./impl/TestTimestampAddDiffFunctions.java
  1. … 6 more files in changeset.
DRILL-6810: Disable NULL_IF_NULL NullHandling for functions with ComplexWriter closes #1509

    • -0
    • +86
    ./impl/TestParseFunctions.java
  1. … 16 more files in changeset.
DRILL-6783: CAST string literal as INTERVAL MONTH/YEAR works inconsistently when selecting from a table with multiple rows

close apache/drill#1496

  1. … 1 more file in changeset.
DRILL-6768: Improve to_date, to_time and to_timestamp and corresponding cast functions to handle empty string when option is enabled closes #1494

    • -48
    • +103
    ./impl/TestCastEmptyStrings.java
    • -0
    • +95
    ./impl/testing/TestDateConversions.java
  1. … 23 more files in changeset.
DRILL-6684: Swap sys.options and sys.options_val tables

The current `sys.options` table has a verbose layout, because of which `sys.options_internal` was introduced. The latter also supports descriptions, which means it makes sense to have that table as the new `sys.options`.

This PR deprecates the old format, so that any dependencies continue to make use of it as long as required.

  1. … 7 more files in changeset.
DRILL-1248: Allow positional / named aliases in group by / having clauses

  1. … 3 more files in changeset.
DRILL-6710: Disallow negative scale for decimal data type

    • -6
    • +50
    ./impl/TestVarDecimalFunctions.java
  1. … 12 more files in changeset.
DRILL-6422: Replace guava imports with shaded ones

    • -1
    • +1
    ./interp/ExpressionInterpreterTest.java
  1. … 974 more files in changeset.
DRILL-6349: Drill JDBC driver fails on Java 1.9+ with NoClassDefFoundError: sun/misc/VM

closes #1446

    • -3
    • +0
    ./impl/testing/TestDateConversions.java
  1. … 20 more files in changeset.
DRILL-6656: Disallow extra semicolons and multiple statements on the same line.

closes #1415

    • -1
    • +1
    ./interp/ExpressionInterpreterTest.java
  1. … 142 more files in changeset.
DRILL-6634: Add udf module under contrib directory and move some udfs into it

1. Created new contrib/udf module.

2. Moved distance, phonetic, networking, crypto functions from java-exec to contrib/udf module.

3. Moved functions from gis module to contrib/udf module. Removed gis module.

4. Removed unnecessary dependencies from java-exec module.

5. Minor refactoring of moved functions code.

closes #1403

    • -133
    • +0
    ./impl/TestNetworkFunctions.java
    • -119
    • +0
    ./impl/TestPhoneticFunctions.java
    • -80
    • +0
    ./impl/TestStringDistanceFunctions.java
  1. … 90 more files in changeset.
DRILL-6472: Prevent using zero precision in CAST function

- Add check for the correctness of scale value;

- Add check for fitting the value to the value with the concrete scale and precision;

- Implement negative UDF for VarDecimal

- Add unit tests for new checks and UDF.
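
The checks listed above can be pictured with `java.math.BigDecimal`. This is a minimal sketch under stated assumptions, not the validation code from this commit: the class and method names are hypothetical, the rule that scale must lie between 0 and precision is an assumption, and rounding with HALF_UP before the fit check is also an assumption.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

final class DecimalFitSketch {
    // Returns true when the value can be represented as DECIMAL(precision, scale).
    // Throws for a zero or negative precision and for a scale outside
    // [0, precision] (assumed rule).
    static boolean fits(BigDecimal value, int precision, int scale) {
        if (precision < 1) {
            throw new IllegalArgumentException("precision must be positive");
        }
        if (scale < 0 || scale > precision) {
            throw new IllegalArgumentException("scale must be in [0, precision]");
        }
        // Round to the target scale first, then count the digits that remain.
        BigDecimal rounded = value.setScale(scale, RoundingMode.HALF_UP);
        return rounded.precision() <= precision;
    }
}
```

For example, 123.45 fits DECIMAL(5, 2) but not DECIMAL(4, 2), and 99.999 does not fit DECIMAL(4, 2) because rounding to two decimal places produces 100.00.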

    • -45
    • +63
    ./impl/TestVarDecimalFunctions.java
  1. … 12 more files in changeset.
DRILL-6519: Add String Distance and Phonetic Functions

closes #1331

    • -0
    • +119
    ./impl/TestPhoneticFunctions.java
    • -0
    • +80
    ./impl/TestStringDistanceFunctions.java
  1. … 5 more files in changeset.
DRILL-6438: Remove excess logging from the tests. - Removed usages of System.out and System.err from the test and replaced with loggers

closes #1284

    • -1
    • +1
    ./interp/ExpressionInterpreterTest.java
  1. … 88 more files in changeset.
DRILL-5188: Expand sub-queries using rules

- Add check for agg with group by literal

- Allow NLJ for limit 1

- Implement single_value aggregate function

closes #1321

  1. … 19 more files in changeset.
DRILL-6386: Remove unused imports and star imports.

  1. … 231 more files in changeset.