drill

DRILL-7140: RM: Drillbits fail with "No enum constant org.apache.drill.exec.resourcemgr.config.selectionpolicy.QueueSelectionPolicy.SelectionPolicy.bestfit"

closes #1720

DRILL-7138: Implement command to describe schema for table

closes #1719

DRILL-7062: Initial implementation of run-time rowgroup pruning

closes #1738

DRILL-7064: Leverage the summary metadata for plain COUNT aggregates.

Add unit test

Modify MetadataDirectGroupScan to track summary file information and use in unit test.

Conflicts:

exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/metadata/Metadata.java

exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/metadata/Metadata_V4.java

Fix NPE for DrillTable to account for non-eligible tables.

Fix bug with direct scan after directory pruning. Add unit test.

Address review comments.

closes #1736

DRILL-7076: Fix NPE in StatsMaterializationVisitor

closes #1722

DRILL-7121: Use the NDV guess (same as before) when statistics is disabled

closes #1718

DRILL-7077: Add Function to Facilitate Time Series Analysis

closes #1680

DRILL-7032: Ignore corrupt rows in a PCAP file

closes #1637

    • binary
    /exec/java-exec/src/test/resources/store/pcap/testv1.pcap
DRILL-7096: Develop vector for canonical Map<K,V>

- Added new type DICT;

- Created value vectors for the type for single and repeated modes;

- Implemented corresponding FieldReaders and FieldWriters;

- Made changes in EvaluationVisitor to be able to read values from the map by key;

- Made changes to DrillParquetGroupConverter to be able to read Parquet's MAP type;

- Added an option `store.parquet.reader.enable_map_support` to disable reading MAP type as DICT from Parquet files;

- Updated AvroRecordReader to use new DICT type for Avro's MAP;

- Added support of the new type to ParquetRecordWriter.

DRILL-7119: Compute range predicate selectivity using histograms.

Address code review comments. Add unit test for histogram usage.

close apache/drill#1733
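
As a rough illustration of the idea behind DRILL-7119, the sketch below estimates range predicate selectivity from an equi-depth histogram, where every bucket is assumed to hold the same share of rows. This is a generic textbook technique, not Drill's actual implementation; the class name `HistogramSelectivity`, its methods, and the linear interpolation inside a bucket are all assumptions made for the example.

```java
// Hypothetical helper, not part of Drill: estimates selectivity from an
// equi-depth histogram described by its sorted bucket boundaries.
final class HistogramSelectivity {

    // Estimated fraction of rows with value < v.
    // boundaries holds n + 1 sorted entries for n equal-depth buckets.
    static double lessThan(double[] boundaries, double v) {
        int buckets = boundaries.length - 1;
        if (buckets < 1 || v <= boundaries[0]) {
            return 0.0;
        }
        if (v >= boundaries[buckets]) {
            return 1.0;
        }
        for (int i = 0; i < buckets; i++) {
            if (v < boundaries[i + 1]) {
                // Here boundaries[i] <= v < boundaries[i + 1], so width > 0.
                double width = boundaries[i + 1] - boundaries[i];
                // Assume values are spread evenly inside the bucket.
                double partial = (v - boundaries[i]) / width;
                return (i + partial) / buckets;
            }
        }
        return 1.0; // not reached because of the upper-bound check above
    }

    // Estimated fraction of rows with lo <= value < hi.
    static double between(double[] boundaries, double lo, double hi) {
        if (hi <= lo) {
            return 0.0;
        }
        return lessThan(boundaries, hi) - lessThan(boundaries, lo);
    }

    public static void main(String[] args) {
        double[] boundaries = {0, 10, 20, 40, 100};
        System.out.println(between(boundaries, 5, 30));
    }
}
```

With four buckets, each boundary interval stands for a quarter of the rows, so a predicate that covers half of one bucket contributes one eighth of the table to the estimate.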

DRILL-7110: Skip writing profile when an ALTER SESSION is executed (#1703)

Allows (by default) `ALTER SESSION SET <option>=<value>` queries to NOT be written to the profile store. This avoids writing a potentially large number of unnecessary profiles, since those changes are also reflected in the queries that follow.

Add public gpg key for Sorabh

Fixed IllegalStateException while reading Parquet data

DRILL-7126: Contrib format-ltsv is not being included in distribution (#1709)

    • -0
    • +1
    /distribution/src/assemble/component.xml
DRILL-7079: Drill can't query views from the S3 storage when plain authentication is enabled

closes #1712

DRILL-7125: REFRESH TABLE METADATA fails after upgrade from Drill 1.13.0 to Drill 1.15.0

DRILL-7118: Filter not getting pushed down on MapR-DB tables.

closes #1708

DRILL-7124: Fix logger for ShowFilesHandler

closes #1705

DRILL-7113: Fix creation of filter conditions for IS NULL and IS NOT NULL for MapR-DB format plugin

close apache/drill#1704

DRILL-7115: Improve Hive schema show tables performance

1. To make SHOW TABLES for a Hive schema much faster, the Drill feature of showing only accessible tables when Storage-Based authorization is enabled was sacrificed. The behaviour now matches Hive/Beeline: all tables are shown regardless of accessibility. For details about previous show tables results, see the description of DRILL-540.

2. Implemented a faster getTableNamesAndTypes() method in HiveDatabaseSchema and removed bulk-related code.

3. Deprecated bulk-related options and removed bulk code from AbstractSchema and DrillHiveMetastoreClient.

4. For 8000 Hive tables the query returned in 1.8 seconds; for a combination of 4000 tables and 8000 views it returned in 2.3 seconds. Note that after the first query the table names are cached, and subsequent queries complete in less than 1 second.

5. Refactored WorkspaceSchemaFactory's getTableNamesAndTypes() method to reuse the existing getViews() method.

6. Refactored DrillHiveMetastoreClient. Classes were unnested and enclosed within the client package with restricted visibility. The cache value type was also updated to avoid unnecessary back-and-forth List to Set conversions. Client creation methods moved to a separate class, so the new package exposes only the factory and the client class.

closes #1706

DRILL-7148: Use improved join cardinality and ndv estimation with statistics

closes #1744

DRILL-7155: Create a standard logging message for batch sizes generated by individual operators. This is needed for QA verification of the Batch Size feature DRILL-6238.

closes #1716

DRILL-6965: Implement schema table function parameter

1. Added a common schema table function parameter which can be used on its own or together with format plugin table function parameters.

2. Allowed creating a schema without columns, in case the user only needs to indicate table properties.

3. Added unit tests.

closes #1777

DRILL-7089: Implement caching for TableMetadataProvider at query level and adapt statistics to use Drill metastore API

closes #1728

DRILL-7109: Apply selectivity calculations to single column filter predicates

closes #1701

DRILL-7111: Fix table function execution for directories

closes #1700

DRILL-7011: Support schema in scan framework

* Adds schema support to the row set-based scan framework and to the "V3" text reader based on that framework.

* Adding the schema made clear that passing options as a long list of constructor arguments was not sustainable. Refactored code to use a builder pattern instead.

* Added support for default values in the "null column loader", which required adding a "setValue" method to the column accessors.

* Added unit tests for all new or changed functionality. See TestCsvWithSchema for the overall test of the entire integrated mechanism.

* Added tests for explicit projection with schema

* Better handling of date/time in column accessors

* Converted recent column metadata work from Java 8 date/time to Joda.

* Added more CSV-with-schema unit tests

* Removed the ID fields from "resolved columns", used "instanceof" instead.

* Added wildcard projection with an output schema. Handles both "lenient" and "strict" schemas.

* Tagged projection columns with their output schema, when available.

* Scan projection added modes for wildcard with an output schema. The reader projection added support for merging reader and output schemas.

* Includes refactoring of scan operator tests (the test file grew too large).

* Renamed some classes to avoid confusing reader schemas with output schemas.

* Added unit tests for the new functionality.

* Added "lenient" wildcard with schema test for CSV

* Added more type conversions: string-to-bit, many-to-string

* Fixed bug in column writer for VarDecimal

* Added missing unit tests, and fixed bugs, in Bit column reader/writer

* Cleaned up a number of unneeded "SuppressWarnings"

closes #1711
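
The DRILL-7011 notes mention replacing a long list of constructor arguments with a builder pattern. The sketch below shows the general shape of that refactoring. All names here (`ScanOptions`, `project`, `useSchema`, `nullDefault`) are hypothetical and chosen only for illustration; they are not the classes or options of Drill's scan framework.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical options holder, not a Drill class: illustrates passing many
// optional settings through a builder instead of a long constructor.
final class ScanOptions {
    private final List<String> projection;
    private final boolean useSchema;
    private final String nullDefault;

    // Only the builder creates instances, so the option list can grow
    // without changing any call site.
    private ScanOptions(Builder b) {
        this.projection = Collections.unmodifiableList(new ArrayList<>(b.projection));
        this.useSchema = b.useSchema;
        this.nullDefault = b.nullDefault;
    }

    static Builder builder() {
        return new Builder();
    }

    List<String> projection() { return projection; }
    boolean useSchema() { return useSchema; }
    String nullDefault() { return nullDefault; }

    static final class Builder {
        private final List<String> projection = new ArrayList<>();
        private boolean useSchema = false;  // assumed default for the sketch
        private String nullDefault = null;  // assumed default for the sketch

        Builder project(String column) {
            projection.add(column);
            return this;
        }

        Builder useSchema(boolean flag) {
            this.useSchema = flag;
            return this;
        }

        Builder nullDefault(String value) {
            this.nullDefault = value;
            return this;
        }

        ScanOptions build() {
            return new ScanOptions(this);
        }
    }

    public static void main(String[] args) {
        ScanOptions options = ScanOptions.builder()
            .project("a")
            .project("b")
            .useSchema(true)
            .build();
        System.out.println(options.projection() + " " + options.useSchema());
    }
}
```

Callers name only the options they care about and leave the rest at their defaults, which is the usual motivation for this pattern once a constructor signature becomes unwieldy.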

DRILL-7106: Fix Intellij warning for FieldSchemaNegotiator

closes #1698

DRILL-7105: Error while building the Drill native client

Added a compiler option in CMakeLists.txt to support the ISO C++ 2011 standard.

Also, changed the CMake min version to 3.1.3 to match the min version specified in protobuf.

closes #1697

DRILL-7107: Unable to connect to Drill 1.15 through ZK

closes #1702