DRILL-7254: Read Hive union w/o nulls

    • -0
    • +3
    ./src/test/resources/complex_types/array/union_array.txt
    • binary
    ./src/test/resources/complex_types/map/map_union_tbl.avro
  1. … 11 more files in changeset.
DRILL-7387: Failed to get value by int key from map nested into struct

  1. … 1 more file in changeset.
DRILL-7380: Query of a field inside of an array of structs returns null

1. Fixed parquet reader projection for Logical lists (DrillParquetReader.java)

2. Fixed projection pushdown for RexFieldAccess (ProjectFieldsVisitor.java)

3. DrillParquetReader.getProjection(...) split into several methods

4. Added javadocs for PathSegment and SchemaPath

  1. … 4 more files in changeset.
DRILL-7357: Expose Drill Metastore data through information_schema

1. Add additional columns to TABLES and COLUMNS tables.

2. Add PARTITIONS table.

3. General refactoring to adjust information_schema data retrieval from multiple sources.

closes #1860

  1. … 33 more files in changeset.
DRILL-7376: Drill ignores Hive schema for MaprDB tables when group scan has star column

  1. … 3 more files in changeset.
DRILL-7252: Read Hive map using Dict<K,V> vector

    • -0
    • +3
    ./src/test/resources/complex_types/array/map_array.json
    • -0
    • +3
    ./src/test/resources/complex_types/map/map_complex_tbl.json
  1. … 5 more files in changeset.
DRILL-4517: Support reading empty Parquet files

1. Modified flat and complex parquet readers to output schema only when requested number of records to read is 0. In this case readers are not initialized to improve performance.

2. Allowed reading requested number of rows instead of all rows in the row group (DRILL-6528).

3. Fixed issue with nulls number determination in the row group (fixed IsPredicate#isAllNulls method).

4. Allowed reading empty parquet files via adding empty / fake row group.

5. General refactoring and unit tests.

6. Parquet tests categorization.

closes #1839

    • binary
    ./src/test/resources/empty.parquet
  1. … 44 more files in changeset.
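A minimal sketch of items 1 and 2 of DRILL-4517 above, not Drill's actual reader code: when zero records are requested the reader exposes only its schema and skips initialization, and otherwise it reads no more than the requested number of rows. The class name `LimitAwareReaderSketch` and all of its members are hypothetical and exist only for illustration.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration; these names are not part of Drill's API. */
final class LimitAwareReaderSketch {
    private final List<String> schema;     // column names only, for illustration
    private final long rowGroupRowCount;   // rows available in the row group
    private final long requestedRows;      // rows the query asked for
    private boolean initialized;

    LimitAwareReaderSketch(List<String> schema, long rowGroupRowCount, long requestedRows) {
        this.schema = schema;
        this.rowGroupRowCount = rowGroupRowCount;
        this.requestedRows = requestedRows;
    }

    /** Skips the (assumed costly) setup when no rows are requested. */
    void setup() {
        if (requestedRows == 0) {
            return; // schema-only mode: nothing to initialize
        }
        initialized = true; // stands in for real column-reader initialization
    }

    /** Never reads more than requested, nor more than the row group holds. */
    long rowsToRead() {
        return Math.min(requestedRows, rowGroupRowCount);
    }

    List<String> schema() {
        return schema;
    }

    boolean isInitialized() {
        return initialized;
    }

    public static void main(String[] args) {
        LimitAwareReaderSketch reader =
            new LimitAwareReaderSketch(Arrays.asList("id", "name"), 1000, 0);
        reader.setup();
        System.out.println(reader.schema() + " initialized=" + reader.isInitialized()
            + " rowsToRead=" + reader.rowsToRead());
    }
}
```

The schema stays available in both modes, so a query that requests zero rows can still learn the column layout without paying for reader setup.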
DRILL-7337: Add vararg UDFs support

  1. … 35 more files in changeset.
DRILL-7314: Use TupleMetadata instead of concrete implementation

1. Add ser / de implementation for TupleMetadata interface based on types.

2. Replace TupleSchema usage where possible.

3. Move patcher classes into commons.

4. Upgrade some dependencies and general refactoring.

  1. … 39 more files in changeset.
DRILL-7316: Move classes from org.apache.drill.metastore into org.apache.drill.exec.metastore package in java-exec module

  1. … 32 more files in changeset.
DRILL-7315: Revise precision and scale order in the method arguments

  1. … 28 more files in changeset.
DRILL-7307: casthigh for decimal type can lead to the issues with VarDecimalHolder

- Fixed code-gen for VarDecimal type

- Fixed code-gen issue with nullable holders for simple cast functions

with passed constants as arguments.

- Code-gen now honors the DataType.Optional type defined by a UDF for

NULL-IF-NULL functions.

  1. … 9 more files in changeset.
DRILL-7313: Use Hive schema for MaprDB native reader when field was empty

- Added all_text_mode option for hive maprDB Json

- Improved logic to convert Hive's schema into Drill's one

- Added unit tests for schema conversion

  1. … 24 more files in changeset.
DRILL-6711: Use jitpack repository for Drill Calcite project artifacts instead of repository.mapr.com

closes #1815

  1. … 7 more files in changeset.
DRILL-7271: Refactor Metadata interfaces and classes to contain all needed information for the File based Metastore

  1. … 119 more files in changeset.
DRILL-7253: Read Hive struct w/o nulls

    • -0
    • +2
    ./src/test/resources/complex_types/array/struct_array.json
    • -0
    • +3
    ./src/test/resources/complex_types/struct/struct_tbl.json
  1. … 11 more files in changeset.
DRILL-7268: Read Hive array with parquet native reader

1. Fixed preserving of group originalType for projected schema

in DrillParquetReader

2. Added reading of LIST logical type to DrillParquetGroupConverter.

Intermediate noop converter used to skip writing for next nested

repeated field after recognition of parent field as LIST. For this

skipRepeated 'true' passed to child converter's constructor.

close apache/drill#1805

  1. … 5 more files in changeset.
DRILL-7251: Read Hive array w/o nulls

1. HiveFieldConverter replaced by Hive writers for primitives

2. Created HiveValueWriterFactory and HiveListWriter to implement arrays support

4. Readers generation replaced by HiveDefaultRecordReader and HiveTextRecordReader

5. Several reader initializers replaced by one

6. Added method to repeated vardecimal writer

7. Minor fix for array column in View

    • -50
    • +0
    ./src/main/codegen/data/HiveFormats.tdd
  1. … 39 more files in changeset.
DRILL-7196: Queries are still runnable on disabled plugins

- Storage client is not created anymore for disabled plugins

- GET "/storage/{name}.json" endpoint now works with

plugin configuration directly, without client instantiation.

This has improved UI responsiveness.

- Hbase and mongo base test classes refactored to honor enabled

plugin attribute

- Fixed path constructor for mongo test datasets:

Now it is cross-platform

- Fixed format of test JSON files that use plugin definitions

- Code cleanup

  1. … 105 more files in changeset.
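The cross-platform path fix mentioned in DRILL-7196 above can be pictured with the sketch below. It is an assumption about the general technique (joining path segments through `java.nio.file.Paths` rather than concatenating strings with a hard-coded separator), not the code from the changeset; `DatasetPathSketch`, `datasetPath` and the file name `emp.json` are hypothetical.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical illustration of building a test dataset path portably. */
final class DatasetPathSketch {

    /** Joins segments with the platform's separator instead of a hard-coded one. */
    static Path datasetPath(String baseDir, String... segments) {
        return Paths.get(baseDir, segments);
    }

    public static void main(String[] args) {
        // Prints the path with the separator of the current platform.
        System.out.println(datasetPath("src", "test", "resources", "emp.json"));
    }
}
```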
[maven-release-plugin] prepare for next development iteration

  1. … 31 more files in changeset.
[maven-release-plugin] prepare release drill-1.16.0

  1. … 31 more files in changeset.
[maven-release-plugin] prepare release drill-1.16.0

  1. … 31 more files in changeset.
DRILL-7062: Initial implementation of run-time rowgroup pruning

closes #1738

  1. … 23 more files in changeset.
DRILL-7115: Improve Hive schema show tables performance

1. To make SHOW TABLES for Hive schema work much faster, the additional Drill

feature of showing only accessible tables when Storage-Based authorization

is enabled was sacrificed. Now the behaviour matches Hive/Beeline: all

tables are shown regardless of accessibility. For details about previous

show tables results, see the description of DRILL-540.

2. In HiveDatabaseSchema implemented faster getTableNamesAndTypes() method

and removed bulk related code.

3. Deprecated bulk related options and removed bulk code from AbstractSchema,

DrillHiveMetastoreClient.

4. For 8000 Hive tables the query returned in 1.8 seconds; for a combination of

4000 tables and 8000 views the query returned in 2.3 seconds. Note that

after the first query table names are cached and subsequent queries complete

in less than 1 sec.

5. Refactored WorkspaceSchemaFactory's getTableNamesAndTypes()

method to reuse existing getViews() method.

6. DrillHiveMetastoreClient was refactored. Classes were unnested and enclosed

within client package with restricted visibility. The cache value type was also

updated to avoid unnecessary back-and-forth List to Set conversions.

Client creation methods moved to a separate class, so the new package

exposes only the factory and client class.

closes #1706

  1. … 6 more files in changeset.
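The caching behaviour described in items 4 and 6 of DRILL-7115 above (table names loaded once, then served from cache, with values stored as a Set to avoid List/Set round trips) can be sketched as follows. This is an illustration under assumptions, not the DrillHiveMetastoreClient implementation: it uses a plain `ConcurrentHashMap` as the cache, and `TableNameCacheSketch`, `tableNames` and `loadCount` are hypothetical names.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical illustration of caching table names per database. */
final class TableNameCacheSketch {
    private final Map<String, Set<String>> cache = new ConcurrentHashMap<>();
    private final Function<String, Set<String>> loader; // stands in for a metastore call
    private final AtomicInteger loads = new AtomicInteger();

    TableNameCacheSketch(Function<String, Set<String>> loader) {
        this.loader = loader;
    }

    /** Loads the names on first access; later calls reuse the cached Set. */
    Set<String> tableNames(String dbName) {
        return cache.computeIfAbsent(dbName, db -> {
            loads.incrementAndGet();
            // Stored as an unmodifiable Set so callers need no List-to-Set conversion.
            return Collections.unmodifiableSet(new LinkedHashSet<>(loader.apply(db)));
        });
    }

    /** Number of times the loader was actually invoked. */
    int loadCount() {
        return loads.get();
    }
}
```

A second call to `tableNames` with the same database name returns the cached Set without invoking the loader again, which mirrors why queries after the first one are reported as faster.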
DRILL-7089: Implement caching for TableMetadataProvider at query level and adapt statistics to use Drill metastore API

closes #1728

  1. … 49 more files in changeset.
DRILL-2326: Fix scalar replacement for the case when static method which does not return values is called

- Fix check for return function value to handle the case when created object is returned without assigning it to the local variable

closes #1687

  1. … 3 more files in changeset.
DRILL-6852: Adapt current Parquet Metadata cache implementation to use Drill Metastore API

Co-authored-by: Volodymyr Vysotskyi <vvovyk@gmail.com>

Co-authored-by: Vitalii Diravka <vitalii@apache.org>

close apache/drill#1646

  1. … 65 more files in changeset.
DRILL-6927: Avoid double conversion from impala timestamp when hive native parquet reader is used

closes #1655

DRILL-7200: Update Calcite to 1.19.0 / 1.20.0

  1. … 45 more files in changeset.
DRILL-5603: Replace String file paths to Hadoop Path

- replaced all String path representation with org.apache.hadoop.fs.Path

- added PathSerDe.Se JSON serializer

- refactoring of DFSPartitionLocation code by leveraging existing listPartitionValues() functionality

closes #1657

  1. … 78 more files in changeset.