drill

DRILL-6374: Transitive Closure leads to TPCH Queries regressions and OOM when run concurrency test

- Test case for Directory Pruning with Transitive Predicates

- Improving description for STRICT_EQUAL_IS_DISTINCT_FROM predicate

closes #1262

DRILL-6402: Repeated Value Vectors copyFrom methods are not updating the value count and writer index correctly for values vector

DRILL-6356: batch sizing for union all

closes #1255

DRILL-6401: Precision for decimal data types may be lost for the case when cast with literal is used

close apache/drill#1254

DRILL-6353: Upgrade Parquet MR dependencies

closes #1259

  1. … 5 more files in changeset.
DRILL-6446: Support for EMIT outcome in TopN

- Added comments for TopNBatch and PriorityQueueTemplate.

- Added support for SchemaChange across next() calls with a HyperVector in the incoming container. This is achieved by adding a new method in HyperVectorWrapper that just updates the vector[] array holding multiple vectors with the provided input ValueVector array, and by modifying RemovingRecordBatch GenericSV4Copier to hold a reference to VectorWrapper instead of ValueVector[] for each column in the incoming batch.

- Handling empty batches. There are two cases: empty batches at the beginning with EMIT outcome, and empty batches between consecutive EMIT outcomes after some batches with data and EMIT outcome have already been received.

Note: In the first case, an empty batch only returned the EMIT outcome without properly creating the output container and SV4 vector. Because of that, if the first batch with EMIT outcome is empty, TopN returns an empty batch with SV mode NONE, and if a later batch arrives with some records and EMIT outcome, it generates an output batch with OK_NEW_SCHEMA (TopN always generates the first output batch with records as OK_NEW_SCHEMA, since it returns output in SV4 mode). Also consider the case where both batches with EMIT outcome were produced after processing the first 2 rows of an input batch. This is a problem because it simulates a schema change across rows of the same incoming batch, which will never be the case.

Note: In the second case, the priority queue will be non-null but uninitialized.

- Also optimized to send the EMIT outcome with the output batch that has all the data to return for the current iteration, rather than sending that batch with OK followed by an empty batch with EMIT outcome.

closes #1293

DRILL-6386: Remove unused imports and star imports.

  1. … 217 more files in changeset.
DRILL-6386: Disallow unused imports and star imports.

    • -5
    • +2
    /src/main/resources/checkstyle-config.xml
DRILL-6255: Drillbit while sending control message to itself creates a connection instead of submitting locally

closes #1253

  1. … 4 more files in changeset.
DRILL-6389: Fixed building javadocs

- Added documentation about how to build javadocs

- Fixed some of the javadoc warnings

closes #1276

    • -0
    • +20
    /docs/dev/Javadocs.md
  1. … 51 more files in changeset.
DRILL-6272: Refactor dynamic UDFs and function initializer tests to generate needed binary and source jars at runtime

close apache/drill#1225

  1. … 15 more files in changeset.
DRILL-6363: Upgrade jmockit and mockito libs

DRILL-6420: Add Lateral and Unnest Keyword for highlighting on WebUI

closes #1261

DRILL-6380: Fix sporadic mongo db hangs.

closes #1249

DRILL-6242 Use java.time.Local{Date|Time|DateTime} for Drill Date, Time, Timestamp types. (#3)

close apache/drill#1247

* DRILL-6242 - Use java.time.Local{Date|Time|DateTime} classes to hold values from corresponding Drill date, time, and timestamp types.

Conflicts:

exec/java-exec/src/main/java/org/apache/drill/exec/vector/complex/fn/ExtendedJsonOutput.java

Fix merge conflicts and check style.
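
The mapping onto `java.time` types described in DRILL-6242 can be illustrated with a minimal sketch. It assumes that date and timestamp values arrive as milliseconds since the epoch in UTC and that time values arrive as milliseconds of the day; the class and method names are hypothetical and are not taken from the Drill code base.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.time.ZoneOffset;

// Hypothetical helper, not Drill's actual code.
final class LocalTemporalSketch {

  // Assumption: a DATE value is stored as epoch milliseconds in UTC.
  static LocalDate toLocalDate(long epochMillis) {
    return Instant.ofEpochMilli(epochMillis).atOffset(ZoneOffset.UTC).toLocalDate();
  }

  // Assumption: a TIME value is stored as milliseconds since midnight.
  static LocalTime toLocalTime(int millisOfDay) {
    return LocalTime.ofNanoOfDay(millisOfDay * 1_000_000L);
  }

  // Assumption: a TIMESTAMP value is stored as epoch milliseconds in UTC.
  static LocalDateTime toLocalDateTime(long epochMillis) {
    return LocalDateTime.ofInstant(Instant.ofEpochMilli(epochMillis), ZoneOffset.UTC);
  }
}
```

Because the `Local*` classes carry no time zone, the conversion pins everything to UTC once and the resulting values do not shift with the JVM's default zone.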

  1. … 32 more files in changeset.
DRILL-6373:

- Adds code to return the proper vector type given the actual vector, adjusting metadata as needed.

- Refactor result set loader

- Revised projection & vector cache

closes #1244

  1. … 25 more files in changeset.
DRILL-6321: Lateral Join and Unnest - rules, options, logical plan supports

Included changes:

* Add planner.enable_unnest_lateral option. Default value set to false.

* Enable FilterCorrectRule

* Add support to logical plan

* Fix rebase errors for DRILL-6321 commits

  1. … 4 more files in changeset.
DRILL-6281: Introduce Collectors class for internal iterators

closes #1238

DRILL-6347: Change method names to "visitField".

Further changed the method names to "visitField" per Vlad Rozov's comments.

closes #1236

DRILL-6361: Revised typeOf() function versions

Added more unit tests.

Updated to handle VARDECIMAL

The VARDECIMAL type was recently added to Drill. Added support for this type. The sqlTypeOf() function now returns DECIMAL(p, s) for precision p, scale s.
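
As an illustration of the DECIMAL(p, s) notation only, the following sketch derives precision and scale from a `java.math.BigDecimal` and formats them the same way. The helper names are hypothetical, and this is not how sqlTypeOf() is implemented in Drill.

```java
import java.math.BigDecimal;

// Hypothetical helper that only mirrors the DECIMAL(p, s) notation.
final class DecimalTypeNameSketch {

  // Formats a type name from an explicit precision and scale.
  static String typeName(int precision, int scale) {
    return String.format("DECIMAL(%d, %d)", precision, scale);
  }

  // Derives precision and scale from a concrete value.
  static String describe(BigDecimal value) {
    return typeName(value.precision(), value.scale());
  }
}
```

For example, `DecimalTypeNameSketch.describe(new BigDecimal("123.45"))` yields `DECIMAL(5, 2)`: five significant digits, two of them after the decimal point.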

closes #1242

DRILL-6027:

- Added fallback option for HashJoin.

- No copy of incoming for single partition, and avoid HT resize.

- Fix memory leak when cancelling while spill file is read

- Get correct schema when probe side is empty

- Re-create the HashJoinProbe

  1. … 28 more files in changeset.
DRILL-6364: Handle Cluster Info in WebUI when existing/new bits restart

As a follow up to DRILL-6289, the following improvements have been done:

1. When loading the page for the first time, the WebUI enables the shutdown button without actually checking the state of the Drillbits.

The ideal behaviour should be to disable the button till the state is verified. [Done]

If a Drillbit is confirmed down (i.e. not in `/state` response), it is marked as OFFLINE and button is disabled.

2. When shutting down the current Drillbit, the WebUI no longer has access to the cluster state.

The ideal behaviour here should be to fetch the state from any of the other Drillbits to update the status. [Done]

With the current Drillbit down, the other Drillbits are queried for cluster state info and the page is updated accordingly.

3. When a new, previously unseen Drillbit comes up, the WebUI will never render it because the table is statically generated during the first page load.

The ideal behaviour should be to append to the table on discovery of a new node. [Done]

The new Drillbit info is injected and a prompt appears to refresh the page to re-populate any missing info. This also works with feature (2) mentioned above.

The only Java code change was to have the state response carry the address and http-port as a tuple, instead of the user-port (which is never used).

Updates based on review

1. Added descriptive comments to functions

2. Handled possible race condition from multiple httpRequests for cluster state

3. Eliminated unused stateKey variable

4. Best practice of using local (let) instead of global (var) objects, and substituting currentRow variable instead of the jQuery search.

Additional changes based on review

1. Random selection of Drillbits to query for state when primary Drillbit is down (limited to 3; with a timeout of 3 sec)

2. Indicate when an Offline Drillbit is de-registered, versus just Offline as per Zookeeper

3. Hide shutdown column when Authentication is enabled, but the user is NOT an Admin.

When the column is visible, remote bits are disabled because

4. Metrics will be shown all the time (except HTTPS), because the information is available and a non-admin user would not have actionable capabilities anyway

5. Basic clean up on unused variables
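
Item 1 of the additional review changes (random selection of at most 3 Drillbits to query) can be sketched as follows. The real change lives in the WebUI's JavaScript; this Java version is only an illustration, the class and method names are hypothetical, and the 3-second request timeout is not modelled.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical illustration of picking a few random peers to ask for cluster state.
final class DrillbitPickerSketch {

  // Upper bound on how many other Drillbits are queried, as stated in the commit message.
  static final int MAX_CANDIDATES = 3;

  // Returns up to MAX_CANDIDATES distinct entries of 'others' in random order.
  static List<String> pickCandidates(List<String> others, Random rng) {
    List<String> shuffled = new ArrayList<>(others); // copy so the caller's list is untouched
    Collections.shuffle(shuffled, rng);
    int limit = Math.min(MAX_CANDIDATES, shuffled.size());
    return new ArrayList<>(shuffled.subList(0, limit));
  }
}
```

Shuffling a copy and taking a prefix keeps the picks distinct and handles clusters with fewer than three remaining Drillbits without special cases.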

Hide shutdown buttons for HTTPS scenarios

Fixed check for 'https'

Since `location.protocol` returns a trailing ':' as well
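
The pitfall can be shown with a small sketch. In a browser, `location.protocol` evaluates to a string such as `https:`, so a plain comparison with `https` fails. The Java helper below is hypothetical and only mimics that comparison; the actual fix is in the WebUI's JavaScript.

```java
// Hypothetical helper mirroring a protocol check that tolerates the trailing ':'.
final class HttpsCheckSketch {

  // Accepts both "https:" (as a browser reports it) and "https".
  static boolean isHttps(String protocol) {
    if (protocol == null) {
      return false;
    }
    String scheme = protocol.endsWith(":")
        ? protocol.substring(0, protocol.length() - 1)
        : protocol;
    return scheme.equalsIgnoreCase("https");
  }
}
```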

Handle metrics lookup on local Drillbit for HTTPS

If the webpage is accessed via IP, the certificate is tied to the IP and not the FQDN, which is what is used for fetching the metrics.

close apache/drill#1241

DRILL-6345: DRILL Query fails on Function LOG10

- Added log10 function implementation

closes #1230

DRILL-143: CGroup Support for Drill-on-YARN

Initial patch

1. Minor fix up

2. Updated variable name for pid file

Revert changes to yarn-drillbit.sh

Based on discussion for PR #1239 (Ref: https://github.com/apache/drill/pull/1239)

closes #1239

    • -15
    • +15
    /distribution/src/resources/drillbit.sh
DRILL-6422: Update guava to 23.0 and shade it

- Fix compilation errors for new version of Guava.

- Remove usage of deprecated API

- Shade guava and add dependencies to the shaded version

- Ban unshaded package

- Introduce drill-shaded module and move guava-shaded under it

- Add methods to convert shaded guava lists to the unshaded ones

- Add instruction for publishing artifacts to the Apache repository

    • -0
    • +62
    /docs/dev/UpgradeGuava.md
    • -0
    • +72
    /drill-shaded/drill-shaded-guava/pom.xml
    • -0
    • +74
    /drill-shaded/pom.xml
  1. … 68 more files in changeset.
DRILL-6348: Received batches are now owned by the receive operators instead of the parent

closes #1237

DRILL-6281: Refactor TimedRunnable

DRILL-6281: Refactor TimedRunnable (rename TimedRunnable to TimedCallable)

DRILL-5927: Fixed memory leak in TestBsonRecordReader, and sped up the test.

closes #1234

DRILL-6333: Fixed Quotation marks

Initial step to making the source-code ready for Javadoc generation

This closes #1229