DRILL-7254: Read Hive union w/o nulls

  1. … 20 more files in changeset.
DRILL-7337: Add vararg UDFs support

    • -0
    • +49
    ./impl/CollectToListFunction.java
    • -4
    • +12
    ./interpreter/InterpreterEvaluator.java
    • -13
    • +19
    ./registry/LocalFunctionRegistry.java
  1. … 28 more files in changeset.
DRILL-7317: Close ClassLoaders used for udf jars uploading when closing FunctionImplementationRegistry

- Fix issue with caching DrillMergeProjectRule and FunctionImplementationRegistry when different drillbits are started within the same JVM

    • -14
    • +61
    ./registry/FunctionRegistryHolder.java
  1. … 1 more file in changeset.
DRILL-7315: Revise precision and scale order in the method arguments

    • -2
    • +2
    ./interpreter/InterpreterEvaluator.java
  1. … 28 more files in changeset.
DRILL-7307: casthigh for decimal type can lead to the issues with VarDecimalHolder

- Fixed code-gen for VarDecimal type

- Fixed code-gen issue with nullable holders for simple cast functions

with passed constants as arguments.

- Code-gen now honors the DataType.Optional type defined by a UDF for

NULL-IF-NULL functions.

    • -1
    • +2
    ./DrillComplexWriterAggFuncHolder.java
    • -3
    • +6
    ./output/DecimalReturnTypeInference.java
  1. … 4 more files in changeset.
DRILL-7253: Read Hive struct w/o nulls

    • -0
    • +57
    ./impl/RowConstructorFunction.java
  1. … 17 more files in changeset.
DRILL-7228: Upgrade to a newer version of t-digest to address inaccuracies in histogram buckets. closes #1774

  1. … 3 more files in changeset.
DRILL-7152: During histogram creation handle the case when all values of a column are NULLs.

close apache/drill#1730

    • -128
    • +192
    ./impl/TDigestFunctions.java
  1. … 1 more file in changeset.
DRILL-7143: Support default value for empty columns

Modifies the prior work to add default values for columns. The prior work added defaults

when the entire column is missing from a reader (the old Nullable Int column). The Row

Set mechanism now will also "fill empty" slots with the default value.

Added default support for the column writers. The writers automatically obtain the

default value from the column schema. The default can also be set explicitly on

the column writer.

Updated the null column mechanism to use this feature rather than the ad-hoc

implementation in the prior commit.

Semantics changed a bit. Only Required columns take a default. The default value

is ignored for nullable columns since nullable columns already have a file default: NULL.

Other changes:

* Updated the CSV-with-schema tests to illustrate the new behavior.

* Made multiple fixes for Boolean and Decimal columns and added unit tests.

* Upgraded FreeMarker to version 2.3.28 to allow use of the continue statement.

* Reimplemented the Bit column reader and writer to use the BitVector directly since this vector is rather special.

* Added get/set Boolean methods for column accessors

* Moved the BooleanType class to the common package

* Added more CSV unit tests to explore decimal types, booleans, and defaults

* Add special handling for blank fields in from-string conversions

* Added options to the conversion factory to specify blank-handling behavior.

CSV uses a mapping of blanks to null (nullable) or default value (non-nullable)

closes #1726
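
The blank-handling rule described above (blanks map to NULL for nullable columns and to the default value for required columns) can be sketched as follows. This is a minimal illustration under stated assumptions, not Drill's conversion code: the class `BlankFieldSketch`, the `Mode` enum, and the `resolve` method are hypothetical names, and values are kept as strings for simplicity.

```java
/** Hypothetical sketch; none of these names come from Drill. */
final class BlankFieldSketch {

    /** Column cardinality, mirroring the Required/nullable split in the text. */
    enum Mode { REQUIRED, NULLABLE }

    /**
     * Resolves the value stored for one CSV field.
     * Non-blank fields pass through unchanged. Blank fields become null for
     * nullable columns and the schema default for required columns.
     */
    static String resolve(String rawField, Mode mode, String schemaDefault) {
        boolean blank = rawField == null || rawField.trim().isEmpty();
        if (!blank) {
            return rawField;
        }
        // Nullable columns ignore the default: their default is already NULL.
        return mode == Mode.NULLABLE ? null : schemaDefault;
    }
}
```

With this sketch, `resolve("", Mode.REQUIRED, "0")` yields `"0"`, while the same blank field in a nullable column yields `null`.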

  1. … 72 more files in changeset.
DRILL-7096: Develop vector for canonical Map<K,V>

- Added new type DICT;

- Created value vectors for the type for single and repeated modes;

- Implemented corresponding FieldReaders and FieldWriters;

- Made changes in EvaluationVisitor to be able to read values from the map by key;

- Made changes to DrillParquetGroupConverter to be able to read Parquet's MAP type;

- Added an option `store.parquet.reader.enable_map_support` to disable reading MAP type as DICT from Parquet files;

- Updated AvroRecordReader to use new DICT type for Avro's MAP;

- Added support of the new type to ParquetRecordWriter.

  1. … 107 more files in changeset.
DRILL-7011: Support schema in scan framework

* Adds schema support to the row set-based scan framework and to the "V3" text reader based on that framework.

* Adding the schema made clear that passing options as a long list of constructor arguments was not sustainable. Refactored code to use a builder pattern instead.

* Added support for default values in the "null column loader", which required adding a "setValue" method to the column accessors.

* Added unit tests for all new or changed functionality. See TestCsvWithSchema for the overall test of the entire integrated mechanism.

* Added tests for explicit projection with schema

* Better handling of date/time in column accessors

* Converted recent column metadata work from Java 8 date/time to Joda.

* Added more CSV-with-schema unit tests

* Removed the ID fields from "resolved columns", used "instanceof" instead.

* Added wildcard projection with an output schema. Handles both "lenient" and "strict" schemas.

* Tagged projection columns with their output schema, when available.

* Scan projection added modes for wildcard with an output schema. The reader projection added support for merging reader and output schemas.

* Includes refactoring of scan operator tests (the test file grew too large.)

* Renamed some classes to avoid confusing reader schemas with output schemas.

* Added unit tests for the new functionality.

* Added "lenient" wildcard with schema test for CSV

* Added more type conversions: string-to-bit, many-to-string

* Fixed bug in column writer for VarDecimal

* Added missing unit tests, and fixed bugs, in Bit column reader/writer

* Cleaned up a number of unneeded "SuppressWarnings"

closes #1711
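
The move from a long constructor argument list to a builder, mentioned above, follows a common Java pattern. The sketch below shows only the general shape, assuming a few invented options: `ScanOptionsSketch`, its fields, and the default batch size are hypothetical and are not taken from the scan framework.

```java
/** Hypothetical options holder built with a builder instead of a long constructor. */
final class ScanOptionsSketch {
    private final String outputSchema;   // null when no schema is provided
    private final boolean strictSchema;  // strict vs. lenient schema handling
    private final int batchSize;

    private ScanOptionsSketch(Builder b) {
        this.outputSchema = b.outputSchema;
        this.strictSchema = b.strictSchema;
        this.batchSize = b.batchSize;
    }

    static Builder builder() {
        return new Builder();
    }

    String outputSchema() { return outputSchema; }
    boolean strictSchema() { return strictSchema; }
    int batchSize() { return batchSize; }

    /** Each setter returns the builder so calls can be chained. */
    static final class Builder {
        private String outputSchema;
        private boolean strictSchema;
        private int batchSize = 4096; // illustrative default, not a Drill value

        Builder outputSchema(String schema) { this.outputSchema = schema; return this; }
        Builder strictSchema(boolean strict) { this.strictSchema = strict; return this; }
        Builder batchSize(int size) { this.batchSize = size; return this; }

        ScanOptionsSketch build() { return new ScanOptionsSketch(this); }
    }
}
```

A caller sets only what it needs, for example `ScanOptionsSketch.builder().outputSchema("a INT").strictSchema(true).build()`, and new options can be added later without touching existing call sites.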

  1. … 224 more files in changeset.
DRILL-7092: Rename map to struct in schema definition

1. Renamed map to struct in schema parser.

2. Updated sqlTypeOf function to return STRUCT instead of MAP, drillTypeOf function will return MAP as before until internal renaming is done.

3. Add is_struct alias to already existing is_map function. Function should be revisited once Drill supports true maps.

4. Updated unit tests.

closes #1688

  1. … 6 more files in changeset.
DRILL-6524: Assign holder fields instead of assigning object references in generated code to allow scalar replacement for more cases closes #1686
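
The difference between the two assignment styles can be illustrated with a small hand-written example. This is a sketch, not Drill's generated code: `IntHolderSketch` is a hypothetical stand-in for a value holder. The assumption is that a JIT compiler finds it harder to scalar-replace holders whose references merge across branches, and whether the optimization actually applies depends on the JVM.

```java
/** Hypothetical illustration of reference assignment vs. field assignment. */
final class HolderAssignmentSketch {

    /** Minimal stand-in for a value holder; not Drill's holder class. */
    static final class IntHolderSketch {
        int value;
    }

    /** The output holder is chosen by copying an object reference. */
    static int pickByReference(boolean cond, int a, int b) {
        IntHolderSketch x = new IntHolderSketch();
        x.value = a;
        IntHolderSketch y = new IntHolderSketch();
        y.value = b;
        IntHolderSketch out;
        if (cond) {
            out = x; // two different objects may flow into 'out'
        } else {
            out = y;
        }
        return out.value;
    }

    /** The output holder is a distinct object; only its field is assigned. */
    static int pickByFieldCopy(boolean cond, int a, int b) {
        IntHolderSketch x = new IntHolderSketch();
        x.value = a;
        IntHolderSketch y = new IntHolderSketch();
        y.value = b;
        IntHolderSketch out = new IntHolderSketch();
        if (cond) {
            out.value = x.value; // plain int copy, no reference merge
        } else {
            out.value = y.value;
        }
        return out.value;
    }
}
```

Both methods return the same result; the second keeps every holder reachable through a single variable.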

  1. … 2 more files in changeset.
DRILL-7200: Update Calcite to 1.19.0 / 1.20.0

    • -0
    • +48
    ./impl/LastDayFunction.java
  1. … 45 more files in changeset.
DRILL-7045 UDF string_binary java.lang.IndexOutOfBoundsException

UDF string_binary was not reallocating the drillbuffer, so it would fill

up and throw an out-of-bounds exception
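
The general remedy for this class of bug is to grow the output buffer before each write rather than assume a fixed size. The sketch below uses `java.nio.ByteBuffer` instead of Drill's buffer type, and the `\xNN` escaping routine is a simplified assumption made for illustration; `GrowableBufferSketch`, `ensureRemaining`, and `escape` are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch of growing an output buffer on demand. */
final class GrowableBufferSketch {

    /** Returns a buffer with at least 'needed' bytes remaining, copying content if it must grow. */
    static ByteBuffer ensureRemaining(ByteBuffer buf, int needed) {
        if (buf.remaining() >= needed) {
            return buf;
        }
        int newCapacity = Math.max(buf.capacity() * 2, buf.position() + needed);
        ByteBuffer bigger = ByteBuffer.allocate(newCapacity);
        buf.flip();      // switch to reading the bytes written so far
        bigger.put(buf); // copy them; 'bigger' continues at the old write position
        return bigger;
    }

    /** Writes printable ASCII bytes as-is and all other bytes as \xNN (assumed format). */
    static String escape(byte[] input, int initialCapacity) {
        ByteBuffer out = ByteBuffer.allocate(initialCapacity);
        for (byte b : input) {
            if (b >= 0x20 && b < 0x7F) {
                out = ensureRemaining(out, 1);
                out.put(b);
            } else {
                byte[] esc = String.format("\\x%02X", b & 0xFF)
                        .getBytes(StandardCharsets.US_ASCII);
                out = ensureRemaining(out, esc.length);
                out.put(esc);
            }
        }
        out.flip();
        return StandardCharsets.US_ASCII.decode(out).toString();
    }
}
```

Because an escaped byte takes four output bytes, the output can be up to four times the input size, which is why a capacity check before every write matters.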

  1. … 1 more file in changeset.
DRILL-4858: REPEATED_COUNT on an array of maps and an array of arrays is not implemented

- Implemented 'repeated_count' function for repeated MAP and repeated LIST;

- Updated RepeatedListReader and RepeatedMapReader implementations to return correct value from size() method

- Moved repeated_count to freemarker template and added support for more repeated types for the function

closes #1641

    • -147
    • +0
    ./impl/SimpleRepeatedFunctions.java
  1. … 8 more files in changeset.
DRILL-7117: Support creation of equi-depth histogram for selected data types.

Support int/bigint/float4/float8, time/timestamp/date and boolean.

Build the histogram from the t-digest byte array and serialize as JSON string.

More changes for serialization/deserialization.

Add code-gen stubs (empty) for VarChar/VarBinary types.

Address review comments (part 1). Add unit test.

Address review comments (part 2) for sampling.

close apache/drill#1715
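
An equi-depth histogram picks bucket boundaries so that each bucket holds roughly the same number of values. The sketch below computes boundaries from an exact sort, which is a deliberate simplification: the commit builds them from a t-digest, which approximates the same quantiles without retaining every value. `EquiDepthSketch` and `boundaries` are hypothetical names.

```java
import java.util.Arrays;

/** Hypothetical sketch of equi-depth bucket boundaries using an exact sort. */
final class EquiDepthSketch {

    /**
     * Returns numBuckets + 1 boundaries taken at evenly spaced quantiles.
     * Returns an empty array when there is no data or no buckets are requested.
     */
    static double[] boundaries(double[] values, int numBuckets) {
        if (values.length == 0 || numBuckets < 1) {
            return new double[0];
        }
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        double[] bounds = new double[numBuckets + 1];
        for (int i = 0; i <= numBuckets; i++) {
            // Position of the i/numBuckets quantile in the sorted data.
            int idx = (int) Math.round((double) i / numBuckets * (sorted.length - 1));
            bounds[i] = sorted[idx];
        }
        return bounds;
    }
}
```

For the skewed sample `{1, 1, 1, 1, 2, 3, 10, 100, 1000}` and four buckets, the boundaries are `1, 1, 2, 10, 1000`: buckets are narrow where values are dense and wide where they are sparse.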

    • -0
    • +1082
    ./impl/TDigestFunctions.java
  1. … 15 more files in changeset.
DRILL-7046: Support for loading and parsing new RM config file closes #1652

  1. … 63 more files in changeset.
DRILL-6533: Allow using literal values in functions which expect FieldReader instead of ValueHolder

closes #1617

    • -14
    • +36
    ./interpreter/InterpreterEvaluator.java
  1. … 2 more files in changeset.
DRILL-6868: Upgrade Janino compiler to 3.0.11

- Remove the workaround that removed adjacent ALOAD-POP instruction pairs

- Remove ModifiedUnparser and use DeepCopier for modifying methods instead of modifying it with custom Unparser implementation

closes #1553

  1. … 3 more files in changeset.
DRILL-6810: Disable NULL_IF_NULL NullHandling for functions with ComplexWriter closes #1509

    • -0
    • +10
    ./DrillComplexWriterFuncHolder.java
    • -34
    • +134
    ./impl/ParseQueryFunction.java
    • -107
    • +145
    ./impl/ParseUrlFunction.java
    • -12
    • +85
    ./impl/conv/JsonConvertFrom.java
  1. … 6 more files in changeset.
DRILL-6084: Show Drill functions in WebUI for autocomplete

Building on top of DRILL-3988 and leveraging DRILL-5868, this makes Drill functions available for autocomplete in the WebUI.

If users want UDFs to show up, they should place the UDF jars in the `$DRILL_HOME/jars/3rdparty` directory so that they are loaded during the Drillbit's startup.

The concept of internal Drill functions is introduced. With this, internal Drill functions like `ConvertToNullableXYZ` have been marked as internal.

The WebUI will not show these functions. However, they are still visible in the `sys.functions` table, which has an additional column indicating whether a function is internal.

Tests have been added as a part of this commit to verify the internal functions concept.

  1. … 10 more files in changeset.
DRILL-1328: Support table statistics

    • -0
    • +285
    ./impl/StatisticsAggrFunctions.java
  1. … 52 more files in changeset.
DRILL-6795: Upgrade Janino compiler from 2.7.6 to 3.0.10

closes #1503

    • -0
    • +110
    ./ModifiedUnparser.java
  1. … 2 more files in changeset.
DRILL-3988: Expose Drill built-in functions & UDFs in a system table (#1483)

This commit exposes available SQL functions in Drill and also detects UDFs that have been dynamically loaded into Drill.

An example is shown below for 2 UDFs dynamically loaded into the cluster, alongside the existing built-in functions that come with Drill.

```
0: jdbc:drill:schema=sys> select source, count(*) as functionCount from sys.functions group by source;
+-----------------------------------------+----------------+
| source                                  | functionCount  |
+-----------------------------------------+----------------+
| built-in                                | 2704           |
| simple-drill-function-1.0-SNAPSHOT.jar  | 12             |
| drill-url-tools-1.0.jar                 | 1              |
+-----------------------------------------+----------------+
3 rows selected (0.209 seconds)
```

The system table exposes information as shown. The UDF is initialized, making the `returnType` available.

The `random(FLOAT8-REQUIRED,FLOAT8-REQUIRED)` function is an example of a UDF that has overloaded arguments (see `signature`).

The `url_parse(VARCHAR-REQUIRED)` function is another example of an initialized UDF.

The rest are built-in functions that meet the query's filter criteria.

```
0: jdbc:drill:schema=sys> select * from sys.functions where name like 'random' or name like '%url%';
+-------------+----------------------------------+-------------+-----------------------------------------+
| name        | signature                        | returnType  | source                                  |
+-------------+----------------------------------+-------------+-----------------------------------------+
| parse_url   | VARCHAR-REQUIRED                 | LATE        | built-in                                |
| random      |                                  | FLOAT8      | built-in                                |
| random      | FLOAT8-REQUIRED,FLOAT8-REQUIRED  | FLOAT8      | simple-drill-function-1.0-SNAPSHOT.jar  |
| url_decode  | VARCHAR-REQUIRED                 | VARCHAR     | built-in                                |
| url_encode  | VARCHAR-REQUIRED                 | VARCHAR     | built-in                                |
| url_parse   | VARCHAR-REQUIRED                 | LATE        | drill-url-tools-1.0.jar                 |
+-------------+----------------------------------+-------------+-----------------------------------------+
6 rows selected (0.619 seconds)
```

    • -0
    • +13
    ./FunctionImplementationRegistry.java
    • -0
    • +34
    ./registry/FunctionRegistryHolder.java
  1. … 7 more files in changeset.
DRILL-6797: Fix UntypedNull handling for complex types

    • -0
    • +73
    ./impl/CompareUntypedNull.java
  1. … 13 more files in changeset.
DRILL-6768: Improve to_date, to_time and to_timestamp and corresponding cast functions to handle empty string when option is enabled closes #1494

  1. … 23 more files in changeset.
DRILL-6762: Fix dynamic UDFs versioning issue

1. Added UndefinedVersionDelegatingStore to serve as versioned wrapper for those stores that do not support versioning.

2. Aligned remote and local function registries version type. Type will be represented as int since ZK version is returned as int.

3. Added NOT_AVAILABLE and UNDEFINED versions to DataChangeVersion holder to indicate proper registry state.

4. Added additional trace logging.

5. Minor refactoring and clean up.

closes #1484

    • -18
    • +33
    ./FunctionImplementationRegistry.java
    • -29
    • +26
    ./registry/FunctionRegistryHolder.java
    • -8
    • +12
    ./registry/LocalFunctionRegistry.java
    • -20
    • +23
    ./registry/RemoteFunctionRegistry.java
  1. … 18 more files in changeset.
DRILL-6763: Codegen optimization of SQL functions with constant values (#1481)

closes #1481

    • -3
    • +3
    ./interpreter/InterpreterEvaluator.java
  1. … 19 more files in changeset.
DRILL-6717: lower and upper functions do not work with national characters

closes #1450

  1. … 1 more file in changeset.