DRILL-7251: Read Hive array w/o nulls

1. HiveFieldConverter replaced by Hive writers for primitives

2. Created HiveValueWriterFactory and HiveListWriter to implement arrays support

4. Readers generation replaced by HiveDefaultRecordReader and HiveTextRecordReader

5. Few reader initializers replaced by one

6. Added method to repeated vardecimal writer

7. Minor fix for array column in View

    • -0
    • +530
    ./HiveDefaultRecordReader.java
    • -0
    • +125
    ./HiveTextRecordReader.java
    • -0
    • +127
    ./ReadersInitializer.java
    • -78
    • +0
    ./initilializers/AbstractReadersInitializer.java
    • -53
    • +0
    ./initilializers/DefaultReadersInitializer.java
    • -46
    • +0
    ./initilializers/EmptyReadersInitializer.java
    • -87
    • +0
    ./initilializers/ReadersInitializer.java
    • -2
    • +2
    ./inspectors/AbstractRecordsInspector.java
    • -42
    • +0
    ./inspectors/DefaultRecordsInspector.java
  1. … 44 more files in changeset.
DRILL-7019: Add check for redundant imports

close apache/drill#1629

    • -1
    • +0
    ./initilializers/DefaultReadersInitializer.java
  1. … 23 more files in changeset.
DRILL-6793: FragmentExecutor cannot send its final state for the case when RootExec root wasn't initialized

closes #1506

  1. … 5 more files in changeset.
DRILL-6724: Dump operator context to logs when error occurs during query execution

closes #1455

  1. … 102 more files in changeset.
DRILL-6422: Replace guava imports with shaded ones

  1. … 984 more files in changeset.
DRILL-6656: Disallow extra semicolons and multiple statements on the same line.

closes #1415

    • -1
    • +1
    ./initilializers/DefaultReadersInitializer.java
  1. … 144 more files in changeset.
DRILL-6473: Update MapR Hive

close apache/drill#1307

  1. … 2 more files in changeset.
DRILL-6386: Remove unused imports and star imports.

    • -2
    • +0
    ./initilializers/EmptyReadersInitializer.java
  1. … 231 more files in changeset.
DRILL-6320: Fixed license headers.

closes #1207

    • -6
    • +7
    ./inspectors/AbstractRecordsInspector.java
    • -6
    • +7
    ./inspectors/DefaultRecordsInspector.java
    • -6
    • +7
    ./inspectors/SkipFooterRecordsInspector.java
  1. … 2064 more files in changeset.
DRILL-6094: Decimal data type enhancements

Add ExprVisitors for VARDECIMAL

Modify writers/readers to support VARDECIMAL

- Added usage of VarDecimal for parquet, hive, maprdb, jdbc;

- Added options to store decimals as int32 and int64 or fixed_len_byte_array or binary;

Add UDFs for VARDECIMAL data type

- modify type inference rules

- remove UDFs for obsolete DECIMAL types

Enable DECIMAL data type by default

Add unit tests for DECIMAL data type

Fix mapping for NLJ when literal with non-primitive type is used in join conditions

Refresh protobuf C++ source files

Changes in C++ files

Add support for decimal logical type in Avro.

Add support for date, time and timestamp logical types.

Update Avro version to 1.8.2.
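
The option above to store decimals as int32, int64, fixed_len_byte_array or binary follows the Parquet DECIMAL annotation, where int32 can hold up to 9 digits of precision and int64 up to 18. A minimal sketch of that choice, assuming a single hypothetical boolean flag (the class, enum and method names are illustrative and not Drill's option names or code; the variable-length binary case is not modeled):

```java
import java.math.BigInteger;

/** Illustrative only: picks a Parquet physical type for a decimal of a given precision. */
final class DecimalStorageSketch {

  enum PhysicalType { INT32, INT64, FIXED_LEN_BYTE_ARRAY }

  /** usePrimitiveTypes is a hypothetical flag standing in for the writer option. */
  static PhysicalType choose(int precision, boolean usePrimitiveTypes) {
    if (precision < 1) {
      throw new IllegalArgumentException("precision must be positive");
    }
    if (usePrimitiveTypes && precision <= 9) {
      return PhysicalType.INT32;   // 10^9 - 1 fits in a signed 32-bit value
    }
    if (usePrimitiveTypes && precision <= 18) {
      return PhysicalType.INT64;   // 10^18 - 1 fits in a signed 64-bit value
    }
    return PhysicalType.FIXED_LEN_BYTE_ARRAY;
  }

  /** Smallest two's-complement byte width that holds any unscaled value of this precision. */
  static int minBytes(int precision) {
    BigInteger max = BigInteger.TEN.pow(precision).subtract(BigInteger.ONE);
    // bitLength() excludes the sign bit: add 1 for the sign, then round up to whole bytes.
    return (max.bitLength() + 8) / 8;
  }
}
```

For example, `minBytes(38)` gives the width needed for a fixed_len_byte_array that stores a 38-digit decimal.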

  1. … 201 more files in changeset.
DRILL-6195: Querying Hive non-partitioned transactional tables via Drill

closes #1140

DRILL-6164: Heap memory leak during parquet scan and OOM

closes #1122

    • -2
    • +2
    ./initilializers/DefaultReadersInitializer.java
  1. … 15 more files in changeset.
DRILL-5978: Updating of Apache and MapR Hive libraries to 2.3.2 and 2.1.2-mapr-1710 versions respectively

* Improvements to allow reading Hive bucketed transactional ORC tables;

* Updating hive properties for tests and resolving dependencies and API conflicts:

- Fix for "hive.metastore.schema.verification", MetaException(message: Version information

not found in metastore) https://cwiki.apache.org/confluence/display/Hive/Hive+Schema+Tool

METASTORE_SCHEMA_VERIFICATION="false" property is added

- Added METASTORE_AUTO_CREATE_ALL="true", properties to tests, because some additional

tables are necessary in Hive metastore

- Disabling Calcite CBO (Hive's CalcitePlanner) for tests, because it conflicts

with Drill's Calcite version in Drill unit tests. HIVE_CBO_ENABLED="false" property

- jackson and parquet libraries are relocated in hive-exec-shade module

- org.apache.parquet:parquet-column Drill version is added to "hive-exec" to

allow using Parquet empty group on MessageType level (PARQUET-278)

- Removing the commons-codec exclusion from hive core. This dependency is

necessary for hive-exec and hive-metastore.

- Setting Hive internal properties for transactional scan:

HiveConf.HIVE_TRANSACTIONAL_TABLE_SCAN and for schema evolution: HiveConf.HIVE_SCHEMA_EVOLUTION,

IOConstants.SCHEMA_EVOLUTION_COLUMNS, IOConstants.SCHEMA_EVOLUTION_COLUMNS_TYPES

- "io.dropwizard.metrics:metrics-core" with the latest 4.0.2 version is added to dependencyManagement block in Drill root POM

- Exclusion of "hive-exec" in "hive-hbase-handler" is already in Drill root dependencyManagement POM

- Hive Calcite libraries are excluded (Calcite CBO was disabled)

- "jackson-core" dependency is added to DependencyManagement block in Drill root POM file

- For MapR Hive 2.1 client older "com.fasterxml.jackson.core:jackson-databind" is included

- "log4j:log4j" dependency is excluded from "hive-exec", "hive-metastore", "hive-hbase-handler".

close apache/drill#1111

  1. … 14 more files in changeset.
DRILL-5941: Skip header / footer improvements for Hive storage plugin

Overview:

1. When a table has a header / footer, process input splits of the same file in one reader (bug fix for DRILL-5941).

2. Apply skip header logic only once, during reader initialization, to avoid checks while reading the data (DRILL-5106).

3. Apply skip footer logic only when the footer count is more than 0; otherwise default processing is done without buffering data in a queue (DRILL-5106).
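
Points 2 and 3 can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from this changeset: the class and method names are hypothetical, and records are modeled as a plain iterator. The header is skipped once before the read loop, and a queue is used only when the footer count is positive.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

/** Hypothetical illustration; not Drill's record inspector classes. */
final class HeaderFooterSkipSketch {

  /** Returns the records of one file minus the first headerCount and the last footerCount rows. */
  static <T> List<T> read(Iterator<T> records, int headerCount, int footerCount) {
    // Skip the header once, up front, so the main loop carries no per-record header check.
    for (int i = 0; i < headerCount && records.hasNext(); i++) {
      records.next();
    }

    List<T> out = new ArrayList<>();
    if (footerCount <= 0) {
      // No footer: default processing, record by record, without buffering.
      while (records.hasNext()) {
        out.add(records.next());
      }
      return out;
    }

    // Footer present: hold back the most recent footerCount records in a queue.
    Deque<T> buffer = new ArrayDeque<>(footerCount + 1);
    while (records.hasNext()) {
      buffer.addLast(records.next());
      if (buffer.size() > footerCount) {
        out.add(buffer.removeFirst());
      }
    }
    // Whatever is still buffered at end of file is the footer and is dropped.
    return out;
  }
}
```

The sketch only gives correct results when it sees a whole file, which is why the input splits of one file have to be handled by the same reader.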

Code changes:

1. AbstractReadersInitializer was introduced to factor out common logic during readers initialization.

It will have two implementations:

a. Default (each input split group gets its own reader);

b. Empty (for empty tables);

2. AbstractRecordsInspector was introduced to improve performance when the table's footer count is less than or equal to 0.

It will have two implementations:

a. Default (records will be processed one by one without buffering);

b. SkipFooter (a queue will be used to buffer the N records that should be skipped at the end of file processing).

3. When a text table has a header / footer, each table file should be read as one unit. When a file is read as several input splits, they should be grouped.

For this purpose the LogicalInputSplit class was introduced, replacing the InputSplitWrapper class. The new class stores a list of grouped input splits and returns information about the splits at group level.

Please note that during planning, input splits are grouped only when data is read from a text table that has a header / footer; otherwise each input split is treated separately.

4. Allow HiveAbstractReader to have multiple input splits instead of one.
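
The grouping described in point 3 can be sketched as below. This is an assumption-laden illustration, not the LogicalInputSplit implementation: `Split` is a hypothetical stand-in for a file input split, and the grouping key is assumed to be the file path.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of grouping input splits per file. */
final class SplitGroupingSketch {

  /** Stand-in for a file input split: a path plus a byte range. */
  static final class Split {
    final String path;
    final long start;
    final long length;

    Split(String path, long start, long length) {
      this.path = path;
      this.start = start;
      this.length = length;
    }
  }

  /** Groups splits by file when the table has a header / footer; otherwise one group per split. */
  static List<List<Split>> group(List<Split> splits, boolean hasHeaderOrFooter) {
    List<List<Split>> groups = new ArrayList<>();
    if (!hasHeaderOrFooter) {
      // Each input split is treated separately.
      for (Split s : splits) {
        List<Split> single = new ArrayList<>();
        single.add(s);
        groups.add(single);
      }
      return groups;
    }
    // Collect all splits of the same file so that one reader sees the whole file.
    Map<String, List<Split>> byPath = new LinkedHashMap<>();
    for (Split s : splits) {
      byPath.computeIfAbsent(s.path, p -> new ArrayList<>()).add(s);
    }
    for (List<Split> fileSplits : byPath.values()) {
      // Keep file order so the header is in the first split and the footer in the last.
      fileSplits.sort(Comparator.comparingLong((Split s) -> s.start));
      groups.add(fileSplits);
    }
    return groups;
  }
}
```

Each inner list would then be handed to a single reader, which matches point 4: a reader accepts multiple input splits instead of one.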

This closes #1030

    • -0
    • +417
    ./HiveAbstractReader.java
    • -0
    • +78
    ./initilializers/AbstractReadersInitializer.java
    • -0
    • +54
    ./initilializers/DefaultReadersInitializer.java
    • -0
    • +48
    ./initilializers/EmptyReadersInitializer.java
    • -0
    • +87
    ./initilializers/ReadersInitializer.java
    • -0
    • +71
    ./inspectors/AbstractRecordsInspector.java
    • -0
    • +41
    ./inspectors/DefaultRecordsInspector.java
    • -0
    • +87
    ./inspectors/SkipFooterRecordsInspector.java
  1. … 14 more files in changeset.