drill

DRILL-4456: Add Hive translate UDF

closes #1527

DRILL-6819: Remove invisible back link in Drill WebUI

DRILL-6691: Unify checkstyle-config.xml files.

closes #1550

  1. … 15 more files in changeset.
DRILL-6744: Support varchar and decimal push down

1. Added enableStringsSignedMinMax parquet format plugin config and store.parquet.reader.strings_signed_min_max session option to control reading binary statistics for files generated by versions of Parquet prior to 1.10.0.

2. Added ParquetReaderConfig to store configuration needed during reading parquet statistics or files.

3. Provided mechanism to enable varchar / decimal filter push down.

4. Added VersionUtil to compare Drill versions in string representation.

5. Added appropriate unit tests.

closes #1537
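
Item 4 above compares Drill versions given as strings. As a rough illustration of that idea only (this is not Drill's `VersionUtil`; the names `parse_version` and `compare_versions` are hypothetical), a version string such as `1.15.0-SNAPSHOT` can be reduced to its numeric parts and compared as a tuple, which avoids the pitfalls of plain string comparison:

```python
import re

def parse_version(version: str) -> tuple:
    """Return (major, minor, patch) parsed from a string like '1.15.0-SNAPSHOT'.

    Hypothetical helper: missing components default to 0, and any
    qualifier after the numeric prefix (e.g. '-SNAPSHOT') is ignored.
    """
    match = re.match(r"(\d+)(?:\.(\d+))?(?:\.(\d+))?", version.strip())
    if match is None:
        raise ValueError(f"not a version string: {version!r}")
    return tuple(int(part) if part is not None else 0 for part in match.groups())

def compare_versions(left: str, right: str) -> int:
    """Return -1, 0 or 1 when left is older than, equal to, or newer than right."""
    a, b = parse_version(left), parse_version(right)
    return (a > b) - (a < b)
```

With this sketch, `compare_versions("1.9.0", "1.10.0")` reports the first version as older, whereas comparing the raw strings would order them the other way round.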

  1. … 27 more files in changeset.
Fixed imports for DRILL-6381

DRILL-6760: Retain original exception in Verbose Error Message

closes #1519

DRILL-6642: Update protocol-buffers version

1. Updated protobuf to version 3.6.1

2. Added protobuf to the root pom dependency management

3. Added classes BoundedByteString and LiteralByteString for compatibility with HBase

4. Added ProtobufPatcher to provide compatibility with MapR-DB and HBase

closes #1639

    • /contrib/native/client/src/protobuf/BitControl.pb.cc (-1824, +2364)
    • /contrib/native/client/src/protobuf/BitControl.pb.h (-1034, +1724)
    • /contrib/native/client/src/protobuf/BitData.pb.cc (-786, +966)
    • /contrib/native/client/src/protobuf/BitData.pb.h (-339, +667)
    • /contrib/native/client/src/protobuf/GeneralRPC.pb.cc (-455, +581)
    • /contrib/native/client/src/protobuf/GeneralRPC.pb.h (-245, +410)
  1. … 26 more files in changeset.
edits to driver pages to include link to MapR Jdbc drivers

doc edit

doc edits: add driver links in 1.14 rn, edit planner.broadcast_threshold option description

edit docs to add links to new mapr drill drivers, edit planner.broadcast_threshold option

DRILL-6809: Handle repeated map in schema inference

It turns out that the RowSet utilities build a repeated map without including the hidden $offsets$ vector in the metadata for the map. But, other parts in Drill do include this vector.

The RowSet behavior might be a bug which can be addressed in another PR.

This PR:

* Adds unit tests for map accessors at the row set level. Looks like these were never added originally. They are a simplified form of the ResultSetLoader map tests.

* Verified that the schema inference can infer a schema from a repeated map (using the RowSet style.)

* Added a test to reproduce the case from the bug.

* Made a tweak to the RowSetBuilder to allow access to the RowSetWriter which is needed by the new tests.

* Couple of minor clean-ups.

closes #1513

DRILL-6715: Update descriptions for System Options table

With the introduction of DRILL-5735, descriptions for about half of the system options are still missing. This commit collects descriptions reviewed by @bbevens

1. Update options for HashAgg/Join (@Ben-Zvi )

2. Update options for Parquet Reader/Writer (@sachouche )

3. Update options for Planners (@HanumathRao , @vdiravka , @KazydubB )

4. Update options for BatchSizing (@bitblender )

5. Update options for Planner Optimizations (@arina-ielchiieva )

6. Update options for Security & Kafka (Krystal Nguyen)

7. Update options for Misc entries (@arina-ielchiieva , @vvysotskyi )

In addition, there is a patch for `org.apache.drill.exec.compile.ClassTransformer.scalar_replacement`, which appears to have replaced `exec.compile.scalar_replacement`. References to the latter have been removed to avoid confusion.

Additional changes include moving the `ClassTransformer` validator to `ExecConstants.java`

Adding support for internal options' descriptions

Removed mention of `Will be removed in 1.15.0`. (Refer to DRILL-6527)

DRILL-6811: Fix type inference to return correct data mode for boolean functions

closes #1510

DRILL-6862: Update Calcite to 1.18.0

1. Moved Calcite dependency from profile hadoop-default to general dependency management

2. Updated Calcite version to 1.18.0-drill-r0 and Avatica version to 1.13.0

3. Hook.REL_BUILDER_SIMPLIFY moved to static block, because now it can't be removed (fixes DRILL-6830)

4. Removed WrappedAccessor, since it was a workaround for an issue fixed in CALCITE-1408

5. Fixed setting of multiple options in TestBuilder

6. TIMESTAMPADD type inference aligned with CALCITE-2699

7. Dependency update caused a 417 kB increase in jdbc-all jar size, so the maxsize limit was increased from 39.5 to 40 MB

8. Added test to TestDrillParquetReader to ensure that DRILL-6856 was fixed by the Calcite update

close apache/drill#1631

DRILL-6868: Upgrade Janino compiler to 3.0.11

- Remove workaround that removed adjacent ALOAD-POP instruction pairs

- Remove ModifiedUnparser and use DeepCopier for modifying methods instead of modifying it with custom Unparser implementation

closes #1553

DRILL-6844: Query with ORDER BY DESC on indexed column does not pick secondary index.

DRILL-6804: Simplify usage of OperatorPhase in HashAgg.

DRILL-6810: Disable NULL_IF_NULL NullHandling for functions with ComplexWriter

closes #1509

DRILL-6084: Show Drill functions in WebUI for autocomplete

Building on top of DRILL-3988 and leveraging DRILL-5868, this makes Drill functions available for autocomplete in the WebUI.

If users want UDFs to show up, they should place the UDF jars in the `$DRILL_HOME/jars/3rdparty` directory so that they can be loaded during the Drillbit's startup.

The concept of internal Drill functions is introduced. With this, internal Drill functions like `ConvertToNullableXYZ` have been marked as internal.

The WebUI will not show these functions. However, they are still visible in the `sys.functions` table with an additional column indicating that a function is internal.

Tests have been added as a part of this commit to verify the internal functions concept.

DRILL-1328: Support table statistics

  1. … 38 more files in changeset.
DRILL-6793: FragmentExecutor cannot send its final state for the case when RootExec root wasn't initialized

closes #1506

DRILL-6792: Find the right probe side fragment wrapper & fix DrillBuf reference count bugs & tune the execution flow & support left deep tree

closes #1504

  1. … 14 more files in changeset.
DRILL-6381: Address code review comments (part 3).

DRILL-6381: Add missing joinControl logic for INTERSECT_DISTINCT.

- Modified HashJoin's probe phase to process INTERSECT_DISTINCT.

- NOTE: For build phase, the functionality will be same as for SemiJoin when it is added later.
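
For readers unfamiliar with the operation, the semantics of INTERSECT DISTINCT can be sketched as a hash build followed by a probe that emits each matching row at most once. This is a minimal illustration of the general technique, not Drill's HashJoin code; the function name `intersect_distinct` and the use of plain hashable values as rows are assumptions made for the example.

```python
def intersect_distinct(build_rows, probe_rows):
    """Return rows present on both sides, each emitted once, in probe order.

    Hypothetical sketch: rows are assumed to be hashable values.
    """
    build_keys = set(build_rows)  # build phase: hash every build-side row
    emitted = set()               # rows that were already output
    result = []
    for row in probe_rows:        # probe phase
        if row in build_keys and row not in emitted:
            emitted.add(row)
            result.append(row)
    return result
```

Duplicates on either side do not affect the result: a row is output only the first time the probe side matches it.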

DRILL-6381: Address code review comment for intersect_distinct.

DRILL-6381: Rebase on latest master and fix compilation issues.

DRILL-6381: Generate protobuf files for C++ native client.

DRILL-6381: Use shaded Guava classes. Add more comments and Javadoc.

  1. … 20 more files in changeset.
DRILL-6795: Upgrade Janino compiler from 2.7.6 to 3.0.10

closes #1503

DRILL-540: Allow querying hive views in Drill

1. Added DrillHiveViewTable which allows construction of DrillViewTable based on Hive metadata

2. Added initialization of DrillHiveViewTable in HiveSchemaFactory

3. Extracted conversion of Hive data types from DrillHiveTable to HiveToRelDataTypeConverter

4. Removed throwing of UnsupportedOperationException from HiveStoragePlugin

5. Added TestHiveViewsSupport and authorization tests

6. Added closeSilently() method to AutoCloseables

closes #1559

DRILL-6791: Scan projection framework

The "schema projection" mechanism:

* Handles none (SELECT COUNT(*)), some (SELECT a, b, x) and all (SELECT *) projection.

* Handles null columns (for projecting a column "x" that does not exist in the base table.)

* Handles constant columns as used for file metadata (AKA "implicit" columns).

* Handles schema persistence: the need to reuse the same vectors across different scanners.

* Provides a framework for consuming externally-supplied metadata.

* Since we don't yet have a way to provide "real" metadata, obtains metadata hints from previous batches and from the projection list (a.b implies that "a" is a map, c[0] implies that "c" is an array, etc.)

* Handles merging the set of data source columns and null columns to create the final output batch.

* Running tests found a failure due to an uninitialized "bits" vector. Added code to explicitly fill the bits vectors with zeros in the "result set loader."
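
The projection-list hints mentioned above (a.b implies a map, c[0] implies an array) can be illustrated with a small sketch. It is not the scan framework's actual code: the function name `infer_hints` and the hint labels `map`, `array` and `unknown` are hypothetical, and only the first path segment of each projected item is examined.

```python
import re

def infer_hints(projection):
    """Map each top-level column name to a hint: 'map', 'array' or 'unknown'.

    Hypothetical helper: 'a.b' suggests that 'a' is a map, 'c[0]' suggests
    that 'c' is an array, and a bare name carries no hint.
    """
    hints = {}
    for item in projection:
        match = re.match(r"(\w+)(\.|\[\d+\])?", item)
        if match is None:
            continue  # e.g. the wildcard '*', which names no column
        name, suffix = match.groups()
        if suffix == ".":
            hint = "map"
        elif suffix is not None:
            hint = "array"
        else:
            hint = "unknown"
        # Keep a specific hint if an earlier item already supplied one.
        if hints.get(name, "unknown") == "unknown":
            hints[name] = hint
    return hints
```

For the projection list `["a.b", "c[0]", "x"]` the sketch marks `a` as a map and `c` as an array, and leaves `x` without a hint, so its type would have to come from another source such as a previous batch.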

  1. … 19 more files in changeset.
DRILL-3988: Expose Drill built-in functions & UDFs in a system table (#1483)

This commit exposes available SQL functions in Drill and also detects UDFs that have been dynamically loaded into Drill.

An example is shown below for 2 UDFs dynamically loaded into the cluster, alongside the existing built-in functions that come with Drill.

```
0: jdbc:drill:schema=sys> select source, count(*) as functionCount from sys.functions group by source;
+-----------------------------------------+----------------+
| source                                  | functionCount  |
+-----------------------------------------+----------------+
| built-in                                | 2704           |
| simple-drill-function-1.0-SNAPSHOT.jar  | 12             |
| drill-url-tools-1.0.jar                 | 1              |
+-----------------------------------------+----------------+
3 rows selected (0.209 seconds)
```

The system table exposes information as shown. The UDF is initialized, making the `returnType` available.

The `random(FLOAT8-REQUIRED,FLOAT8-REQUIRED)` function is an example of a UDF that has overloaded arguments (see `signature`).

The `url_parse(VARCHAR-REQUIRED)` function is another example of an initialized UDF.

The rest are built-in functions that meet the query's filter criteria.

```
0: jdbc:drill:schema=sys> select * from sys.functions where name like 'random' or name like '%url%';
+-------------+----------------------------------+-------------+-----------------------------------------+
| name        | signature                        | returnType  | source                                  |
+-------------+----------------------------------+-------------+-----------------------------------------+
| parse_url   | VARCHAR-REQUIRED                 | LATE        | built-in                                |
| random      |                                  | FLOAT8      | built-in                                |
| random      | FLOAT8-REQUIRED,FLOAT8-REQUIRED  | FLOAT8      | simple-drill-function-1.0-SNAPSHOT.jar  |
| url_decode  | VARCHAR-REQUIRED                 | VARCHAR     | built-in                                |
| url_encode  | VARCHAR-REQUIRED                 | VARCHAR     | built-in                                |
| url_parse   | VARCHAR-REQUIRED                 | LATE        | drill-url-tools-1.0.jar                 |
+-------------+----------------------------------+-------------+-----------------------------------------+
6 rows selected (0.619 seconds)
```

DRILL-6797: Fix UntypedNull handling for complex types

DRILL-6381: Address review comments (part 2): fix formatting issues and add javadoc.

  1. … 15 more files in changeset.