drill

[maven-release-plugin] prepare release drill-1.10.0

    /contrib/data/tpch-sample-data/pom.xml (-1, +1)
    … 13 more files in changeset.
DRILL-5326: Unit tests failures related to the SERVER_METADTA

- adding of the sql type name for the "GENERIC_OBJECT";

- changing "NullCollation" in the "ServerMetaProvider" to the correct default value;

- changing RpcType to GET_SERVER_META in the appropriate ServerMethod

close #775

DRILL-5313: Fix compilation issue in C++ connector

DRILL-5301 and DRILL-5167 have conflicting changes, which causes

the C++ connector to not compile: the static symbol for the search

escape string has been removed as the server might use a different one.

Fix the issue by using the current search escape string (injected from the

meta to the internal drill client when querying metadata).

close #769

Bump maxsize of jdbc-all jar to accommodate the increased size of jar file due to new code.

DRILL-5304: Queries fail intermittently when there is skew in data distribution

close #766

DRILL-5301: Add C++ client support for Server metadata API

Add support for the Server metadata API to the C++ client, if

available. If the API is not supported by the server, fall back

to the previous hard-coded values.

Update the querySubmitter example program to query the information.

close #764

    /contrib/native/client/src/clientlib/metadata.cpp (-623, +1057)
    /contrib/native/client/src/protobuf/User.pb.cc (-135, +3709)
    /contrib/native/client/src/protobuf/User.pb.h (-3533, +6438)
DRILL-5195: Publish Operator and MajorFragment Stats in Profile page

Improved UI

1. Introduction of Tooltips

2. Share of each operator as a percentage of the major fragment and of the query

- This would help identify the most CPU intensive operators within a fragment and across the query

3. Rows emitted by each operator

4. For a running query, 'last update' and 'last progress' now show the elapsed time since each event.

closes #756

DRILL-4280: CORE (revert DRILL-3242)

+ DRILL-3242 aims to offload request handling to a secondary thread, but this feature is disabled by default due to concurrency issues

+ One implication of the feature was that exceptions not of UserRpcException type were ignored. Exceptions must not be ignored; they should be handled properly, especially in the context of security

DRILL-5293: Change seed for distribution hash function to differ from that of the hash table

close #765
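A minimal sketch of why the two seeds in DRILL-5293 should differ, not Drill's actual hashing code. It assumes rows are routed to a receiver by `hash % NUM_PARTITIONS` and then placed in that receiver's hash table by `hash % NUM_BUCKETS`. The helper `seeded_hash`, the partition and bucket counts, and the use of BLAKE2b are all illustrative assumptions.

```python
import hashlib


def seeded_hash(key: int, seed: int) -> int:
    """64-bit hash of an integer key; the seed is passed as the BLAKE2b salt."""
    digest = hashlib.blake2b(
        key.to_bytes(8, "little"),
        digest_size=8,
        salt=seed.to_bytes(8, "little"),
    )
    return int.from_bytes(digest.digest(), "little")


NUM_PARTITIONS = 4   # hypothetical number of receiving fragments
NUM_BUCKETS = 16     # hypothetical hash-table size on each receiver


def buckets_used(distribution_seed: int, table_seed: int) -> int:
    """Count the hash-table buckets hit by the keys routed to partition 0."""
    used = set()
    for key in range(1000):
        if seeded_hash(key, distribution_seed) % NUM_PARTITIONS == 0:
            used.add(seeded_hash(key, table_seed) % NUM_BUCKETS)
    return len(used)


# Same seed: every key in partition 0 has hash % 4 == 0, so it can only
# land in buckets 0, 4, 8 or 12 of the 16 available.
same_seed = buckets_used(distribution_seed=0, table_seed=0)

# Different seeds: the table hash is unrelated to the routing hash, so the
# keys can spread over the whole table.
different_seed = buckets_used(distribution_seed=1, table_seed=0)
```

With a shared seed the receiver can use at most a quarter of its buckets here, which is the kind of clustering that a separate distribution seed avoids.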

DRILL-5301: Server metadata API

Add a Server metadata API to the User protocol, to query server support

of various SQL features.

Add support to the client (DrillClient) to query this information.

Add support to the JDBC driver to query this information if the server supports

the new API, or fall back to the previous behaviour (rely on Avatica defaults) otherwise.

close #764
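The query-or-fall-back behaviour described for DRILL-5301 can be sketched roughly as below. This is an illustration under stated assumptions, not the DrillClient or JDBC driver code: `ServerMeta`, `FakeConnection`, `resolve_server_meta`, and the default values are hypothetical. Only the RPC name `GET_SERVER_META` comes from the log.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class ServerMeta:
    """A tiny, hypothetical subset of what a server might report."""
    search_escape_string: str
    identifier_quote_string: str


# Hypothetical hard-coded defaults used when the server lacks the API.
DEFAULT_META = ServerMeta(search_escape_string="\\", identifier_quote_string="`")


class FakeConnection:
    """Stand-in for a client connection; not a real Drill class."""

    def __init__(self, supported_methods: Iterable[str],
                 meta: Optional[ServerMeta] = None) -> None:
        self.supported_methods = set(supported_methods)
        self._meta = meta

    def get_server_meta(self) -> ServerMeta:
        if "GET_SERVER_META" not in self.supported_methods or self._meta is None:
            raise RuntimeError("server does not support GET_SERVER_META")
        return self._meta


def resolve_server_meta(conn: FakeConnection) -> ServerMeta:
    """Ask the server when it advertises the RPC, else use the defaults."""
    if "GET_SERVER_META" in conn.supported_methods:
        return conn.get_server_meta()
    return DEFAULT_META


old_server = FakeConnection(supported_methods=[])
new_server = FakeConnection(["GET_SERVER_META"], ServerMeta("!", '"'))
```

Checking the advertised methods before calling keeps older servers working without a failed round trip.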

    … 21 more files in changeset.
DRILL-5208: Finding path to java executable should be deterministic

See DRILL-5208 for background. Instead of using “find” to locate the

java command, we use any information available, resorting to find

only if the “usual suspects” fail. The result is that we use the JDK

java when available, instead of randomly choosing JDK or JRE java.

close #763
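The lookup order described for DRILL-5208 might look like the sketch below. It assumes the "usual suspects" are `JAVA_HOME` and `PATH`, which the log does not spell out. The function name `find_java` is hypothetical, and Drill's launch scripts are shell, not Python.

```python
import os
import shutil
from typing import Mapping, Optional


def find_java(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return a java executable from well-known locations, or None.

    None tells the caller that only a slower filesystem search is left.
    """
    # 1. An explicit JAVA_HOME wins, so the choice is deterministic.
    java_home = env.get("JAVA_HOME")
    if java_home:
        candidate = os.path.join(java_home, "bin", "java")
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate

    # 2. Otherwise take the first java found on PATH.
    on_path = shutil.which("java", path=env.get("PATH", ""))
    if on_path:
        return on_path

    # 3. Nothing found in the usual places.
    return None
```

Passing the environment in as a parameter keeps the function easy to exercise without changing the real process environment.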

DRILL-5284: Roll-up of final fixes for managed sort

See subtasks for details.

* Provide detailed, accurate estimate of size consumed by a record batch

* Managed external sort spills too often with Parquet data

* Managed External Sort fails with OOM

* External sort refers to the deprecated HDFS fs.default.name param

* Config param drill.exec.sort.external.batch.size is not used

* NPE in managed external sort while spilling to disk

* External Sort BatchGroup leaks memory if an OOM occurs during read

* DRILL-5294: Under certain low-memory conditions, need to force the sort to merge

two batches to make progress, even though this is a bit more than

comfortably fits into memory.

close #761

    … 8 more files in changeset.
DRILL-5274: Exception thrown in Drillbit shutdown in UDF cleanup code

closes #760

DRILL-4280: CORE (unit tests)

+ Modify existing tests to use new authentication configuration

+ Add TestUserBitKerberos and TestBitBitKerberos using Apache Kerby library

DRILL-4280: CORE (user to bit authentication, C++)

closes #578

    /contrib/native/client/cmakeModules/FindSASL.cmake (-0, +49)
DRILL-5290: Provide an option to build operator table once for built-in static functions and reuse it across queries.

close #757

DRILL-5255: Remove default temporary workspace check at drillbit start up

closes #759

DRILL-5190: Display planning and queued time for a query's profile page

Modified UserSharedBit protobuf for marking planning and wait-in-queue end times. This will allow for accurately reporting the planning, queued and actual execution times of a query.

Planning Time:

In the absence of the planning time's end, for older profiles, the root fragment's (i.e. SCREEN operator) start time is taken as the estimated end of planning time, and as the estimated start time of the execution phase.

QueueWait Time:

We do not estimate the queue time if the planning end time is not available.

Execution Time:

We calculate the execution time based on which of these two times (planning end, queue-wait end) is available. The computation falls back as follows, in decreasing order of accuracy:

1. Execution time = endTime(Query) - end(QueueWait)

2. Execution time = endTime(Query) - end(Planning)

3. Execution time = endTime(Query) - start(rootFragment) {Estimated}

closes #738
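The fallback order for the DRILL-5190 execution time can be sketched as follows. This is an illustration only: the function name, the millisecond timestamps, and the use of `None` for a missing field are assumptions, not the profile page's actual code.

```python
from typing import Optional, Tuple


def execution_time_ms(
    query_end: int,
    root_fragment_start: int,
    planning_end: Optional[int] = None,
    queue_wait_end: Optional[int] = None,
) -> Tuple[int, bool]:
    """Return (execution time in ms, is_estimated).

    Uses the most accurate start marker available:
    queue-wait end, then planning end, then root fragment start.
    """
    if queue_wait_end is not None:
        return query_end - queue_wait_end, False
    if planning_end is not None:
        return query_end - planning_end, False
    # Older profiles record neither marker, so the result is an estimate.
    return query_end - root_fragment_start, True


full_profile = execution_time_ms(
    query_end=10_000, root_fragment_start=2_100,
    planning_end=1_500, queue_wait_end=2_000,
)
no_queue_info = execution_time_ms(
    query_end=10_000, root_fragment_start=2_100, planning_end=1_500,
)
old_profile = execution_time_ms(query_end=10_000, root_fragment_start=2_100)
```

Returning the estimate flag alongside the value lets a caller mark the third case as approximate, as the message does with {Estimated}.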

DRILL-5287: Provide option to skip updates of ephemeral state changes in Zookeeper

close #758

DRILL-5259: Allow listing a user-defined number of profiles

Allow changing default number of finished queries in web UI, when starting up Drillbits.

Option provided in drill-override.conf (default=100 ; defined in drill-module.conf)

Alternatively, the limit can be set per page load with the max query parameter.

e.g.

https://<hostname>:8047/profiles?max=100

closes #751
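For DRILL-5259, a small sketch of building the profiles URL with a chosen limit. The path, port, and `max` parameter come from the example in the message. The helper name `profiles_url` and the host name below are made up for illustration.

```python
from urllib.parse import urlencode, urlunsplit


def profiles_url(hostname: str, max_profiles: int, port: int = 8047) -> str:
    """Build the web UI profiles URL that lists up to max_profiles entries."""
    if max_profiles < 1:
        raise ValueError("max_profiles must be a positive integer")
    query = urlencode({"max": max_profiles})
    return urlunsplit(("https", f"{hostname}:{port}", "/profiles", query, ""))


url = profiles_url("drillbit.example.com", 250)
# url == "https://drillbit.example.com:8047/profiles?max=250"
```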

DRILL-5275: Sort spill is slow due to repeated allocations

Rather than create a heap buffer per vector when writing and reading,

the revised code creates a single, shared buffer used for all I/O

within a particular container. This improves performance by reducing GC

and CPU costs during I/Os.

Move I/O buffer and methods to allocator

Allows the buffer to be shared. Especially in the sort, this is

important, as the sort may have many serializations open at once.

closes #754
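The idea behind DRILL-5275, one staging buffer reused for every vector instead of a fresh buffer per vector, can be illustrated with the sketch below. It is a simplified analogy, not Drill's allocator code: `SharedBufferWriter` and its buffer size are hypothetical.

```python
import io
from typing import BinaryIO


class SharedBufferWriter:
    """Copies vectors to a stream through one reusable staging buffer."""

    def __init__(self, out: BinaryIO, buffer_size: int = 32 * 1024) -> None:
        self._out = out
        # Allocated once and reused for every vector written.
        self._buffer = bytearray(buffer_size)

    def write_vector(self, vector: bytes) -> None:
        """Stream one vector through the shared buffer in fixed-size chunks."""
        view = memoryview(vector)
        size = len(self._buffer)
        for start in range(0, len(view), size):
            chunk = view[start:start + size]
            n = len(chunk)
            self._buffer[:n] = chunk          # same-length copy, no new buffer
            self._out.write(memoryview(self._buffer)[:n])


out = io.BytesIO()
writer = SharedBufferWriter(out, buffer_size=4)
writer.write_vector(b"abcdefghij")   # written as chunks of 4, 4 and 2 bytes
writer.write_vector(b"xyz")
```

Because the writer owns the buffer, any number of vectors in a container share a single allocation, which is the saving the commit message describes.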

DRILL-5260: Extend "Cluster Fixture" test framework

- Config option to suppress printing of CSV and other output. (Allows

printing for single tests, not printing when running from Maven.)

- Parsing of query profiles to extract plan and run time information.

- Fix bug in log fixture when enabling logging for a package.

- Improved ZK support.

- Set up the new CTTAS default temporary workspace for tests.

- Clean up persistent storage files on disk to avoid CTTAS startup

failures.

- Provides a set of examples for how to use the cluster fixture.

closes #753

    … 4 more files in changeset.
DRILL-5273: CompliantTextReader exhausts 4 GB memory when reading 5000 small files

Please see JIRA for details of problem and fix.

closes #750

DRILL-5266: Parquet returns low-density batches

Fixes one glaring problem related to bit/byte confusion.

Includes a few clean-up items found along the way.

Additional fixes from code review comments

More code clean up from code review

close #749

DRILL-5258: Access mock data definition from SQL

Extends the mock data source to allow using the full power of the mock

data source from an SQL query by referencing the JSON definition

file. See JIRA and package-info for details.

Adds a boolean data generator and a varying-length string generator.

Adds “mock” table stats for use in the planner.

Revisions based on code review comments

close #752

    … 8 more files in changeset.
DRILL-5263: Prevent left NLJoin with non-scalar subqueries

DRILL-5257: Run-time control of query profiles

Adds a run-time option to save (default) or not save query profiles.

Adds a run-time option to save query profiles in "debug" mode:

that is, after returning the last client response. (Normal mode is

to return the response before writing the profile.)

Tests for normal case are normal unit tests. Tests for debug mode

case are unit tests using the new framework that parse profiles.

The test framework is extended to save query profiles using this

new option.

Modifies the test framework to use the new options when a test

asks to save query profiles.

closes #747

DRILL-5252: Fix a condition that always returns true

close #745

DRILL-5243: Fix TestContextFunctions.sessionIdUDFWithinSameSession unit test

This closes #743

DRILL-5040: Parquet writer unable to delete table folder on abort

close apache/drill#744