DRILL-7164: KafkaFilterPushdownTest is sometimes failing to pattern match correctly

closes #1760

    • -16
    • +32
    ./java/org/apache/drill/PlanTestBase.java
  1. … 1 more file in changeset.
DRILL-7183: TPCDS query 10, 35, 69 take longer with sf 1000 when Statistics are disabled. This commit reverts the changes done for DRILL-6997.

  1. … 5 more files in changeset.
DRILL-6988. Utility of the too long error message when syntax error

- Adding Drill wrapper around SqlParseException to customize the messages produced by Calcite

- Fix Drill SQL parse exception formatter to calculate proper position for "^" character

closes #1753

  1. … 3 more files in changeset.
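
The caret placement mentioned in the DRILL-6988 entry above can be illustrated with a minimal sketch. This is not Drill's formatter: the class `CaretMarker` and its `mark` method are hypothetical names, and the sketch assumes the parser reports 1-based line and column numbers and that the query contains no tab characters.

```java
// Hypothetical illustration only; not the formatter changed by DRILL-6988.
final class CaretMarker {

  /**
   * Returns the SQL text with a "^" line inserted under the given position.
   * Both line and column are assumed to be 1-based.
   */
  static String mark(String sql, int line, int column) {
    String[] lines = sql.split("\n", -1);
    StringBuilder out = new StringBuilder();
    for (int i = 0; i < lines.length; i++) {
      out.append(lines[i]).append('\n');
      if (i == line - 1) {
        // column - 1 spaces put the caret directly under the 1-based column.
        for (int c = 1; c < column; c++) {
          out.append(' ');
        }
        out.append('^').append('\n');
      }
    }
    return out.toString();
  }

  public static void main(String[] args) {
    // The misspelled keyword "form" starts at column 10 of line 1.
    System.out.print(mark("select * form t", 1, 10));
  }
}
```

An off-by-one in the padding loop would move the caret under the wrong character, which is the kind of positioning error the entry describes.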
DRILL-6974: SET option command modification

- ALTER ... RESET ... and ALTER ... SET ... sub-parsers separated into 2 different SqlCall classes with the same parent SqlSetOption

- parserImpls modified to handle the new syntax of the ALTER ... SET ... expression:

a) ALTER ... SET option.name - option.value - setting option value

b) ALTER ... SET option.name - display option value

- Handler for SqlSetOption separated into SetOptionHandler and ResetOptionHandler for better representation of handled statements

- Base abstract class AbstractSqlSetHandler created to avoid repeating the shared implementation of the same functions

- SetOptionHandler covered with unit tests for each statement form.

Fix issues stated in the review

closes #1763

  1. … 9 more files in changeset.
DRILL-7167: Implemented DESCRIBE TABLE statement

- altered parser implementation to honor DESCRIBE TABLE syntax

- extended test coverage to check the new statement

closes #1747

  1. … 1 more file in changeset.
DRILL-7171: Create metadata directories cache file in the leaf level directories to support ConvertCountToDirectScan optimization.

closes #1748

  1. … 1 more file in changeset.
DRILL-7166: Count query with wildcard should skip reading of metadata summary file

  1. … 1 more file in changeset.
DRILL-7159: Fix typeString method to return correct name for MAP (aka STRUCT)

closes #1741

  1. … 2 more files in changeset.
DRILL-7049: REST API returns the toString of byte arrays (VARBINARY types)

closes #1739

  1. … 1 more file in changeset.
DRILL-7152: During histogram creation handle the case when all values of a column are NULLs.

close apache/drill#1730

  1. … 1 more file in changeset.
DRILL-7045: Updates to address review comments

closes #7134

DRILL-7146: Query failing with NPE when ZK queue is enabled.

  1. … 1 more file in changeset.
DRILL-7143: Support default value for empty columns

Modifies the prior work to add default values for columns. The prior work added defaults when the entire column is missing from a reader (the old Nullable Int column). The Row Set mechanism now will also "fill empty" slots with the default value.

Added default support for the column writers. The writers automatically obtain the default value from the column schema. The default can also be set explicitly on the column writer.

Updated the null column mechanism to use this feature rather than the ad-hoc implementation in the prior commit.

Semantics changed a bit. Only Required columns take a default. The default value is ignored for nullable columns since nullable columns already have a file default: NULL.

Other changes:

* Updated the CSV-with-schema tests to illustrate the new behavior.

* Made multiple fixes for Boolean and Decimal columns and added unit tests.

* Upgraded FreeMarker to version 2.3.28 to allow use of the continue statement.

* Reimplemented the Bit column reader and writer to use the BitVector directly since this vector is rather special.

* Added get/set Boolean methods for column accessors

* Moved the BooleanType class to the common package

* Added more CSV unit tests to explore decimal types, booleans, and defaults

* Add special handling for blank fields in from-string conversions

* Added options to the conversion factory to specify blank-handling behavior. CSV uses a mapping of blanks to null (nullable) or default value (non-nullable)

closes #1726

    • -1
    • +1
    ./java/org/apache/drill/test/ClusterTest.java
    • -0
    • +182
    ./java/org/apache/drill/test/rowSet/test/TestDummyWriter.java
  1. … 58 more files in changeset.
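
The blank-handling rule described in the DRILL-7143 entry above (blanks map to NULL for nullable columns and to the default value for required columns) can be sketched as follows. This is an illustration, not Drill's column writer code: `BlankFieldDemo`, `Mode`, and `resolve` are hypothetical names, and treating whitespace-only input as blank is an assumption.

```java
// Hypothetical illustration only; not the column writers changed by DRILL-7143.
final class BlankFieldDemo {

  enum Mode { REQUIRED, NULLABLE }

  /**
   * Resolves a raw text field to the value that would be stored.
   * Non-blank input is passed through unchanged. Blank input becomes the
   * default for REQUIRED columns and null for NULLABLE columns, where the
   * default is ignored.
   */
  static String resolve(String raw, Mode mode, String defaultValue) {
    // Assumption: null and whitespace-only input both count as blank.
    boolean blank = raw == null || raw.trim().isEmpty();
    if (!blank) {
      return raw;
    }
    return mode == Mode.REQUIRED ? defaultValue : null;
  }

  public static void main(String[] args) {
    System.out.println(resolve("", Mode.REQUIRED, "0"));
    System.out.println(resolve("", Mode.NULLABLE, "0"));
    System.out.println(resolve("5", Mode.REQUIRED, "0"));
  }
}
```

The sketch does not cover a required column that has no default configured; the entry does not say what happens in that case.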
DRILL-7140: RM: Drillbits fail with "No enum constant org.apache.drill.exec.resourcemgr.config.selectionpolicy.QueueSelectionPolicy.SelectionPolicy.bestfit"

closes #1720

  1. … 1 more file in changeset.
DRILL-7138: Implement command to describe schema for table

closes #1719

    • -0
    • +112
    ./java/org/apache/drill/TestSchemaCommands.java
  1. … 6 more files in changeset.
DRILL-7062: Initial implementation of run-time rowgroup pruning

closes #1738

  1. … 23 more files in changeset.
DRILL-7064: Leverage the summary metadata for plain COUNT aggregates.

Add unit test

Modify MetadataDirectGroupScan to track summary file information and use in unit test.

Conflicts:

exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/metadata/Metadata.java

exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/metadata/Metadata_V4.java

Fix NPE for DrillTable to account for non-eligible tables.

Fix bug with direct scan after directory pruning. Add unit test.

Address review comments.

closes #1736

  1. … 9 more files in changeset.
DRILL-7076: Fix NPE in StatsMaterializationVisitor

closes #1722

  1. … 2 more files in changeset.
DRILL-7032: Ignore corrupt rows in a PCAP file

closes #1637

    • binary
    ./resources/store/pcap/testv1.pcap
  1. … 3 more files in changeset.
DRILL-7096: Develop vector for canonical Map<K,V>

- Added new type DICT;

- Created value vectors for the type for single and repeated modes;

- Implemented corresponding FieldReaders and FieldWriters;

- Made changes in EvaluationVisitor to be able to read values from the map by key;

- Made changes to DrillParquetGroupConverter to be able to read Parquet's MAP type;

- Added an option `store.parquet.reader.enable_map_support` to disable reading MAP type as DICT from Parquet files;

- Updated AvroRecordReader to use new DICT type for Avro's MAP;

- Added support of the new type to ParquetRecordWriter.

    • -0
    • +459
    ./java/org/apache/drill/exec/record/vector/TestDictVector.java
    • -0
    • +35
    ./java/org/apache/drill/test/TestBuilder.java
    • binary
    ./resources/store/parquet/complex/simple_map.parquet
  1. … 95 more files in changeset.
DRILL-7119: Compute range predicate selectivity using histograms.

Address code review comments. Add unit test for histogram usage.

close apache/drill#1733

    • -2
    • +2
    ./java/org/apache/drill/PlanTestBase.java
  1. … 4 more files in changeset.
DRILL-7148: Use improved join cardinality and ndv estimation with statistics

closes #1744

  1. … 11 more files in changeset.
DRILL-6965: Implement schema table function parameter

1. Added common schema table function parameter which can be used as a single unit or with format plugin table function parameters.

2. Allowed creating a schema without columns, in case the user only needs to indicate table properties.

3. Added unit tests.

closes #1777

    • -1
    • +63
    ./java/org/apache/drill/TestSchemaCommands.java
    • -0
    • +249
    ./java/org/apache/drill/TestSchemaWithTableFunction.java
    • -38
    • +16
    ./java/org/apache/drill/TestSelectWithOption.java
  1. … 25 more files in changeset.
DRILL-7089: Implement caching for TableMetadataProvider at query level and adapt statistics to use Drill metastore API

closes #1728

  1. … 49 more files in changeset.
DRILL-7111: Fix table function execution for directories

closes #1700

    • -4
    • +35
    ./java/org/apache/drill/TestSelectWithOption.java
  1. … 1 more file in changeset.
DRILL-7011: Support schema in scan framework

* Adds schema support to the row set-based scan framework and to the "V3" text reader based on that framework.

* Adding the schema made clear that passing options as a long list of constructor arguments was not sustainable. Refactored code to use a builder pattern instead.

* Added support for default values in the "null column loader", which required adding a "setValue" method to the column accessors.

* Added unit tests for all new or changed functionality. See TestCsvWithSchema for the overall test of the entire integrated mechanism.

* Added tests for explicit projection with schema

* Better handling of date/time in column accessors

* Converted recent column metadata work from Java 8 date/time to Joda.

* Added more CSV-with-schema unit tests

* Removed the ID fields from "resolved columns", used "instanceof" instead.

* Added wildcard projection with an output schema. Handles both "lenient" and "strict" schemas.

* Tagged projection columns with their output schema, when available.

* Scan projection added modes for wildcard with an output schema. The reader projection added support for merging reader and output schemas.

* Includes refactoring of scan operator tests (the test file grew too large.)

* Renamed some classes to avoid confusing reader schemas with output schemas.

* Added unit tests for the new functionality.

* Added "lenient" wildcard with schema test for CSV

* Added more type conversions: string-to-bit, many-to-string

* Fixed bug in column writer for VarDecimal

* Added missing unit tests, and fixed bugs, in Bit column reader/writer

* Cleaned up a number of unneeded "SuppressWarnings"

closes #1711

    • -1
    • +0
    ./java/org/apache/drill/PlanningBase.java
    • -1
    • +1
    ./java/org/apache/drill/TestSchemaCommands.java
  1. … 210 more files in changeset.
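
The DRILL-7011 entry above mentions replacing a long list of constructor arguments with a builder. A minimal sketch of that general pattern follows. It does not reproduce the scan framework's classes: `ReaderOptions`, its fields, and its default values are all hypothetical and chosen only for illustration.

```java
// Hypothetical illustration of the builder pattern; not Drill's scan framework API.
final class ReaderOptions {
  final boolean hasHeaders;
  final char delimiter;
  final String schemaPath; // null means no schema was provided

  private ReaderOptions(Builder b) {
    this.hasHeaders = b.hasHeaders;
    this.delimiter = b.delimiter;
    this.schemaPath = b.schemaPath;
  }

  static Builder builder() {
    return new Builder();
  }

  static final class Builder {
    // Defaults live in one place, so callers set only what differs.
    private boolean hasHeaders = true;
    private char delimiter = ',';
    private String schemaPath;

    Builder hasHeaders(boolean value) { this.hasHeaders = value; return this; }
    Builder delimiter(char value) { this.delimiter = value; return this; }
    Builder schemaPath(String value) { this.schemaPath = value; return this; }

    ReaderOptions build() {
      return new ReaderOptions(this);
    }
  }

  public static void main(String[] args) {
    ReaderOptions opts = ReaderOptions.builder().delimiter('|').build();
    System.out.println(opts.delimiter + " " + opts.hasHeaders + " " + opts.schemaPath);
  }
}
```

With a builder, adding a new option such as a schema means adding one setter rather than changing every constructor call site.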
DRILL-7106: Fix Intellij warning for FieldSchemaNegotiator

closes #1698

  1. … 4 more files in changeset.
DRILL-7095: Expose table schema (TupleMetadata) to physical operator (EasySubScan)

1. Add system / session option store.table.use_schema_file to control if file schema can be used during query execution. False by default.

2. Added methods in the StoragePlugin interface which allow creating a Group Scan with a provided table schema.

3. EasyGroupScan and EasySubScan now contain table schema, also they are able to serialize / deserialize it along with other scan properties.

4. DrillTable, which is the main entry point for schema provisioning, has a method to store the schema and later uses it to create the physical scan.

5. WorkspaceSchema when returning Drill table instance will get table schema from table root if available and if store.table.use_schema_file is set to true.

This PR is the next step for the Schema Provisioning project, which currently exposes schema only for the text reader.

closes #1696

    • -0
    • +2
    ./java/org/apache/drill/PlanningBase.java
  1. … 15 more files in changeset.
DRILL-7103: Incorrect conversion to integer in BsonRecordReader#writeTimeStamp

closes #1695

  1. … 1 more file in changeset.
DRILL-7100: Fixed IllegalArgumentException when reading Parquet data

  1. … 4 more files in changeset.