drill

Clone Tools
  • last updated a few minutes ago
Constraints
Constraints: committers
 
Constraints: files
Constraints: dates
DRILL-7154: TPCH query 4, 17 and 18 take longer with sf 1000 when Statistics are disabled.

closes #1737

DRILL-7048: Implement JDBC Statement.setMaxRows() with System Option

This introduces support for JDBC's Statement.setMaxRows(int) API, which can help Drill execute a query much faster if it knows that not ALL the records in the resultset will be consumed upfront.

This Commit introduces the core changes to support the feature within Drill's execution engine

Protobuf Changes

1. RunQuery: Added "autolimit_rowcount"

2. QueryProfile: Added "autoLimit"

3. Regenerated Java and C++ client files

REST API support

1. Support for REST server to interpret a submitted query and also for rendering this information for an executed query

2. Updates to the Freemarker templates (for WebUI)

3. Safety check within Javascript (for WebUI)

JDBC API support

1. Introduces backend execution of 'ALTER SESSION' to apply the auto-limiting of resultset size

2. Added Unit Tests for PreparedStatement and Statement objects

3. Added getter setter methods to be skipped in testing for org.apache.drill.jdbc.test.Drill2489CallsAfterCloseThrowExceptionsTest.testclosedPreparedStmtOfOpenConnMethodsThrowRight()

Updates based on review comments

Additional Updates

Test Cleanup

1. Revert Drill2489 hack

2. Formatting in *StatementTest

3. Removal f redundant `statement.close()`

4. Manage new Exception thrown when setting invalid maxRow values

Final updates

1. Test changes

2. Trim trailing spaces in auto-limit value (Javascript)

3. Before & After annotations to synchronize changes to system values for MaxRows(auto-limit)

Reorganized tests due to synchronized locking

Removed conflicting JsonCreator in QueryWrapper

Additional test cleanup

closes #1714

    • -6
    • +40
    /contrib/native/client/src/protobuf/User.pb.h
  1. … 20 more files in changeset.
DRILL-7153: Drill Fails to Build using JDK 1.8.0_65 closes #1731

DRILL-7152: During histogram creation handle the case when all values of a column are NULLs.

close apache/drill#1730

DRILL-7045: Updates to address review comments

closes #7134

DRILL-7150: Fix timezone conversion for timestamp from MaprDB after the transition from PDT to PST closes #1729

DRILL-7146: Query failing with NPE when ZK queue is enabled.

DRILL-7145: Exceptions happened during retrieving values from ValueVector are not being displayed at the Drill Web UI closes #1727

DRILL-7143: Support default value for empty columns

Modifies the prior work to add default values for columns. The prior work added defaults

when the entire column is missing from a reader (the old Nullable Int column). The Row

Set mechanism now will also "fill empty" slots with the default value.

Added default support for the column writers. The writers automatically obtain the

default value from the column schema. The default can also be set explicitly on

the column writer.

Updated the null column mechanism to use this feature rather than the ad-hoc

implemention in the prior commit.

Semantics changed a bit. Only Required columns take a default. The default value

is ignored or nullable columns since nullable columns already have a file default: NULL.

Other changes:

* Updated the CSV-with-schema tests to illustrate the new behavior.

* Made multiple fixes for Boolean and Decimal columns and added unit tests.

* Upgraded Fremarker to version 2.3.28 to allow use of the continue statement.

* Reimplemented the Bit column reader and writer to use the BitVector directly since this vector is rather special.

* Added get/set Boolean methods for column accessors

* Moved the BooleanType class to the common package

* Added more CSV unit tests to explore decimal types, booleans, and defaults

* Add special handling for blank fields in from-string conversions

* Added options to the conversion factory to specify blank-handling behavior.

CSV uses a mapping of blanks to null (nullable) or default value (non-nullable)

closes #1726

  1. … 58 more files in changeset.
DRILL-7142: Add space after > in SqlLine prompt

closes #1721

DRILL-7140: RM: Drillbits fail with "No enum constant org.apache.drill.exec.resourcemgr.config.selectionpolicy.QueueSelectionPolicy.SelectionPolicy.bestfit"

closes #1720

DRILL-7138: Implement command to describe schema for table

closes #1719

DRILL-7062: Initial implementation of run-time rowgroup pruning closes #1738

  1. … 10 more files in changeset.
DRILL-7064: Leverage the summary metadata for plain COUNT aggregates.

Add unit test

Modify MetadataDirectGroupScan to track summary file information and use in unit test.

Conflicts:

exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/metadata/Metadata.java

exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/metadata/Metadata_V4.java

Fix NPE for DrillTable to account for non-eligible tables.

Fix bug with direct scan after directory pruning. Add unit test.

Address review comments.

closes #1736

DRILL-7076: Fix NPE in StatsMaterializationVisitor

closes #1722

DRILL-7121: Use the NDV guess (same as before) when statistics is disabled

closes #1718

DRILL-7077: Add Function to Facilitate Time Series Analysis

closes #1680

DRILL-7032: Ignore corrupt rows in a PCAP file

closes #1637

    • binary
    /exec/java-exec/src/test/resources/store/pcap/testv1.pcap
DRILL-7096: Develop vector for canonical Map<K,V>

- Added new type DICT;

- Created value vectors for the type for single and repeated modes;

- Implemented corresponding FieldReaders and FieldWriters;

- Made changes in EvaluationVisitor to be able to read values from the map by key;

- Made changes to DrillParquetGroupConverter to be able to read Parquet's MAP type;

- Added an option `store.parquet.reader.enable_map_support` to disable reading MAP type as DICT from Parquet files;

- Updated AvroRecordReader to use new DICT type for Avro's MAP;

- Added support of the new type to ParquetRecordWriter.

  1. … 94 more files in changeset.
DRILL-7119: Compute range predicate selectivity using histograms.

Address code review comments. Add unit test for histogram usage.

close apache/drill#1733

DRILL-7110: Skip writing profile when an ALTER SESSION is executed (#1703)

Allows (by default) for `ALTER SESSION SET <option>=<value>` queries to NOT be writen to the profile store. This would avoid the risk of potentially adding up to a lot of profiles being written unnecessarily, since those changes are also reflected on the queries that follow.

Add public gpg key for Sorabh

Fixed IllegalStateException while reading Parquet data

DRILL-7126: Contrib format-ltsv is not being included in distribution (#1709)

    • -0
    • +1
    /distribution/src/assemble/component.xml
DRILL-7079: Drill can't query views from the S3 storage when plain authentication is enabled

closes #1712

DRILL-7125: REFRESH TABLE METADATA fails after upgrade from Drill 1.13.0 to Drill 1.15.0

DRILL-7118: Filter not getting pushed down on MapR-DB tables.

closes #1708

DRILL-7124: Fix logger for ShowFilesHandler

closes #1705

DRILL-7113: Fix creation of filter conditions for IS NULL and IS NOT NULL for MapR-DB format plugin

close apache/drill#1704

DRILL-7115: Improve Hive schema show tables performance

1. To make SHOW TABLES for Hive schema work much faster, additional Drill

feature of showing only accesible tables when Storage-Based authorization

is enabled was sacrificed. Now the behaviour matches to Hive/Beeline, all

tables will be shown despite of accessibility. For details about previous

show tables results, check description of DRILL-540.

2. In HiveDatabaseSchema implemented faster getTableNamesAndTypes() method

and removed bulk related code.

3. Deprecated bulk related options and removed bulk code from AbstractSchema,

DrillHiveMetastoreClient.

4. For 8000 Hive tables query returned in 1.8 seconds, for combination of

4000 tables and 8000 views query returned in 2.3 seconds. Note, that

after first query table names will be cached and next queries will perform

in less than 1 sec.

5. Refactored WorkspaceSchemaFactory's getTableNamesAndTypes()

method to reuse existing getViews() method.

6. DrillHiveMetastoreClient was refactored. Classes were unnested and enclosed

within client package with restricted visibility. Also was updated cache

values type to avoid unnecessarry List to Set back and forth conversions.

Client creation methods moved to separate class. So the new package

exposes only factory and client class.

closes #1706

  1. … 6 more files in changeset.