DRILL-7200: Update Calcite to 1.19.0 / 1.20.0

    • -1
    • +2
    ./exec/fn/hive/TestInbuiltHiveUDFs.java
  1. … 46 more files in changeset.
DRILL-6977: Improve Hive tests configuration

1. HiveTestBase data initialization moved to static block

to be initialized once for all derivatives.

2. Extracted Hive driver and storage plugin management from HiveTestDataGenerator

to HiveTestFixture class. This increased cohesion of generator and

added loose coupling between hive test configuration and data generation

tasks.

3. Replaced usage of Guava ImmutableLists with TestBaseViewSupport

helper methods by using standard JDK collections.

closes #1613

    • -0
    • +295
    ./exec/hive/HiveTestFixture.java
    • -16
    • +18
    ./exec/hive/TestHiveStorage.java
    • -23
    • +24
    ./exec/sql/hive/TestViewSupportOnHiveTables.java
    • -122
    • +15
    ./exec/store/hive/HiveTestDataGenerator.java
  1. … 2 more files in changeset.
DRILL-4456: Add Hive translate UDF

closes #1527

    • -0
    • +13
    ./exec/fn/hive/TestInbuiltHiveUDFs.java
  1. … 1 more file in changeset.
DRILL-6744: Support varchar and decimal push down

1. Added enableStringsSignedMinMax parquet format plugin config and store.parquet.reader.strings_signed_min_max session option to control reading binary statistics for files generated by versions of Parquet prior to 1.10.0.

2. Added ParquetReaderConfig to store configuration needed during reading parquet statistics or files.

3. Provided mechanism to enable varchar / decimal filter push down.

4. Added VersionUtil to compare Drill versions in string representation.

5. Added appropriate unit tests.

closes #1537

    • -0
    • +41
    ./exec/TestHiveDrillNativeParquetReader.java
    • -5
    • +5
    ./exec/store/hive/HiveTestDataGenerator.java
  1. … 40 more files in changeset.
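Item 4 of DRILL-6744 mentions comparing Drill versions given as strings. The following is only a minimal sketch of that kind of comparison, not the VersionUtil code from the changeset: the class name, the method name, and the decision to ignore a qualifier such as "-SNAPSHOT" are all assumptions made for illustration.

```java
public class VersionCompareSketch {

    /**
     * Compares dotted numeric versions such as "1.14.0" and "1.15.0-SNAPSHOT".
     * Returns a negative number, zero, or a positive number, like Comparator.
     */
    public static int compare(String left, String right) {
        int[] a = parse(left);
        int[] b = parse(right);
        int n = Math.max(a.length, b.length);
        for (int i = 0; i < n; i++) {
            // Missing trailing components are treated as 0, so "1.15" equals "1.15.0".
            int x = i < a.length ? a[i] : 0;
            int y = i < b.length ? b[i] : 0;
            if (x != y) {
                return Integer.compare(x, y);
            }
        }
        return 0;
    }

    private static int[] parse(String version) {
        // Assumption: anything after the first '-' (for example "-SNAPSHOT") is ignored.
        String numeric = version.split("-", 2)[0];
        String[] parts = numeric.split("\\.");
        int[] out = new int[parts.length];
        for (int i = 0; i < parts.length; i++) {
            out[i] = Integer.parseInt(parts[i]);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(compare("1.14.0", "1.15.0-SNAPSHOT") < 0);
        System.out.println(compare("1.15", "1.15.0") == 0);
    }
}
```

Comparing component by component as integers avoids the usual pitfall of plain string comparison, where "1.9.0" would sort after "1.10.0".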
DRILL-540: Allow querying hive views in Drill

1. Added DrillHiveViewTable which allows construction of DrillViewTable based

on Hive metadata

2. Added initialization of DrillHiveViewTable in HiveSchemaFactory

3. Extracted conversion of Hive data types from DrillHiveTable

to HiveToRelDataTypeConverter

4. Removed throwing of UnsupportedOperationException from HiveStoragePlugin

5. Added TestHiveViewsSupport and authorization tests

6. Added closeSilently() method to AutoCloseables

closes #1559

    • -0
    • +233
    ./exec/hive/TestHiveViewsSupport.java
    • -2
    • +12
    ./exec/hive/TestInfoSchemaOnHiveStorage.java
    • -4
    • +10
    ./exec/store/hive/HiveTestDataGenerator.java
  1. … 9 more files in changeset.
DRILL-6422: Replace guava imports with shaded ones

    • -1
    • +1
    ./exec/fn/hive/TestInbuiltHiveUDFs.java
    • -1
    • +1
    ./exec/hive/TestInfoSchemaOnHiveStorage.java
    • -2
    • +2
    ./exec/store/hive/HiveTestDataGenerator.java
  1. … 976 more files in changeset.
DRILL-6492: Ensure schema / workspace case insensitivity in Drill

1. StoragePluginsRegistryImpl was updated:

a. for backward compatibility, to convert all existing storage plugin names to lower case at init; in case of duplicates, a warning is logged and the duplicate is skipped.

b. to wrap persistent plugins registry into case insensitive store wrapper (CaseInsensitivePersistentStore) to ensure all given keys are converted into lower case when performing insert, update, delete, search operations.

c. to load system storage plugins dynamically by @SystemPlugin annotation.

2. StoragePlugins class was updated to store storage plugin configs by name in a case insensitive map.

3. SchemaUtilities.searchSchemaTree method was updated to convert all schema names into lower case to ensure that they are matched case insensitively (all schemas are stored in Drill in lower case).

4. FileSystemConfig was updated to store workspaces by name in case insensitive hash map.

5. All plugin schema factories now extend AbstractSchemaFactory to ensure that the given schema name is converted to lower case.

6. New method areTableNamesAreCaseInsensitive was added to AbstractSchema to indicate whether a schema's table names are case insensitive (false by default). A schema implementation that supports case insensitive table names is responsible for performing the case insensitive search itself. Currently, information_schema, sys and hive do so.

7. System storage plugins (information_schema, sys) were refactored to ensure their schema and table names are case insensitive; the annotation @SystemPlugin and an additional constructor were also added to allow dynamic loading of system plugins into the storage plugin registry during the init phase.

8. MetadataProvider was updated to convert all schema filter conditions into lower case to ensure schema would be matched case insensitively.

9. ShowSchemasHandler, ShowTablesHandler, DescribeTableHandler were updated to ensure schema / tables names (this depends if schema supports case insensitive table names) would be found case insensitively.

git closes #1439

    • -1
    • +1
    ./exec/hive/TestInfoSchemaOnHiveStorage.java
  1. … 51 more files in changeset.
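Items 1a and 1b of DRILL-6492 describe lower-casing every key before it reaches the plugin store and skipping names that collide after lower-casing. A minimal sketch of that idea follows; it is not the CaseInsensitivePersistentStore from the changeset. The class name, the in-memory HashMap delegate, and the method names are assumptions chosen to keep the example self-contained.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class CaseInsensitiveStoreSketch<V> {

    // Assumption: a plain in-memory map stands in for the persistent store.
    private final Map<String, V> delegate = new HashMap<>();

    // Every key is normalized to lower case before any operation.
    private static String key(String name) {
        return name.toLowerCase(Locale.ROOT);
    }

    public V get(String name) {
        return delegate.get(key(name));
    }

    public void put(String name, V value) {
        delegate.put(key(name), value);
    }

    /** Returns false when a name that differs only by case is already stored. */
    public boolean putIfAbsent(String name, V value) {
        return delegate.putIfAbsent(key(name), value) == null;
    }

    public void delete(String name) {
        delegate.remove(key(name));
    }

    public int size() {
        return delegate.size();
    }

    public static void main(String[] args) {
        CaseInsensitiveStoreSketch<String> store = new CaseInsensitiveStoreSketch<>();
        store.put("DFS", "file system config");
        System.out.println(store.get("dfs"));
        // A second name that differs only by case is treated as a duplicate and skipped.
        if (!store.putIfAbsent("Dfs", "another config")) {
            System.out.println("skipping duplicate plugin name: Dfs");
        }
    }
}
```

Normalizing in one wrapper keeps callers free of case handling; Locale.ROOT is used so that lower-casing does not depend on the JVM default locale.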
DRILL-6656: Disallow extra semicolons and multiple statements on the same line.

closes #1415

    • -2
    • +4
    ./exec/fn/hive/TestSampleHiveUDFs.java
  1. … 144 more files in changeset.
DRILL-6575: Add store.hive.conf.properties option to allow set Hive properties at session level

closes #1365

    • -0
    • +17
    ./exec/TestHiveDrillNativeParquetReader.java
    • -0
    • +2
    ./exec/hive/TestInfoSchemaOnHiveStorage.java
    • -35
    • +45
    ./exec/store/hive/HiveTestDataGenerator.java
  1. … 18 more files in changeset.
DRILL-6454: Native MapR DB plugin support for Hive MapR-DB json table

closes #1314

    • -2
    • +2
    ./exec/TestHiveDrillNativeParquetReader.java
  1. … 16 more files in changeset.
DRILL-6438: Remove excess logging from the tests. - Removed usages of System.out and System.err from the tests and replaced them with loggers

closes #1284

    • -3
    • +2
    ./exec/test/Drill2130StorageHiveCoreHamcrestConfigurationTest.java
  1. … 89 more files in changeset.
DRILL-6242 Use java.time.Local{Date|Time|DateTime} for Drill Date, Time, Timestamp types. (#3)

close apache/drill#1247

* DRILL-6242 - Use java.time.Local{Date|Time|DateTime} classes to hold values from corresponding Drill date, time, and timestamp types.

Conflicts:

exec/java-exec/src/main/java/org/apache/drill/exec/vector/complex/fn/ExtendedJsonOutput.java

Fix merge conflicts and check style.

    • -11
    • +9
    ./exec/TestHiveDrillNativeParquetReader.java
    • -7
    • +8
    ./exec/fn/hive/TestInbuiltHiveUDFs.java
    • -19
    • +18
    ./exec/hive/TestHiveStorage.java
  1. … 44 more files in changeset.
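DRILL-6242 switches date, time, and timestamp values to java.time.LocalDate, LocalTime, and LocalDateTime. The sketch below only illustrates what those JDK types look like when built from raw millisecond values; the class and method names are hypothetical, and interpreting the milliseconds as UTC is an assumption of this example, not something stated in the commit.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.time.ZoneOffset;

public class LocalTemporalSketch {

    /** Timestamp: milliseconds since the epoch, read as UTC (assumption). */
    public static LocalDateTime timestampFromMillis(long epochMillis) {
        return LocalDateTime.ofInstant(Instant.ofEpochMilli(epochMillis), ZoneOffset.UTC);
    }

    /** Date: the calendar day that contains the given epoch milliseconds. */
    public static LocalDate dateFromMillis(long epochMillis) {
        return timestampFromMillis(epochMillis).toLocalDate();
    }

    /** Time: milliseconds elapsed since midnight. */
    public static LocalTime timeFromMillisOfDay(int millisOfDay) {
        return LocalTime.ofNanoOfDay(millisOfDay * 1_000_000L);
    }

    public static void main(String[] args) {
        System.out.println(timestampFromMillis(0L));
        System.out.println(dateFromMillis(86_400_000L));
        System.out.println(timeFromMillisOfDay(3_600_000));
    }
}
```

The Local* types carry no time zone, which matches values that are meant to be shown exactly as stored.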
DRILL-6173: Support transitive closure during filter push down and partition pruning

closes #1216

    • -0
    • +15
    ./exec/TestHivePartitionPruning.java
  1. … 35 more files in changeset.
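DRILL-6173 concerns transitive closure of predicates: from a = b AND b = 5 a planner can infer a = 5, and the inferred filter may then be pushed down or used for partition pruning. The sketch below shows only that inference step on a simplified model (column equalities plus column = literal filters). It is not the planner code from the changeset, and every name in it is hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class TransitiveClosureSketch {

    /**
     * @param equalities pairs of column names known to be equal, e.g. {"a", "b"}
     * @param constants  known "column = literal" filters
     * @return the given filters plus every filter that follows by transitivity
     */
    public static Map<String, String> infer(List<String[]> equalities, Map<String, String> constants) {
        Map<String, String> known = new TreeMap<>(constants);
        boolean changed = true;
        // Repeat until a full pass adds nothing new; each change adds one column, so this terminates.
        while (changed) {
            changed = false;
            for (String[] eq : equalities) {
                changed |= propagate(known, eq[0], eq[1]);
                changed |= propagate(known, eq[1], eq[0]);
            }
        }
        return known;
    }

    // Copies the literal of 'from' to 'to' when 'to' has no literal yet.
    private static boolean propagate(Map<String, String> known, String from, String to) {
        String literal = known.get(from);
        if (literal != null && !known.containsKey(to)) {
            known.put(to, literal);
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        List<String[]> equalities = List.of(new String[] {"a", "b"}, new String[] {"b", "c"});
        System.out.println(infer(equalities, Map.of("c", "5")));
    }
}
```

With the equalities a = b and b = c and the filter c = 5, the sketch also derives filters for a and b, which is the extra information a scan on either side of a join could use.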
DRILL-6320: Fixed license headers.

closes #1207

    • -1
    • +1
    ./exec/fn/hive/TestSampleHiveUDFs.java
    • -1
    • +1
    ./exec/hive/TestInfoSchemaOnHiveStorage.java
    • -1
    • +0
    ./exec/store/hive/HiveTestDataGenerator.java
  1. … 2052 more files in changeset.
DRILL-6331: Revisit Hive Drill native parquet implementation to be exposed to Drill optimizations (filter / limit push down, count to direct scan)

1. Factored out common logic for Drill parquet reader and Hive Drill native parquet readers: AbstractParquetGroupScan, AbstractParquetRowGroupScan, AbstractParquetScanBatchCreator.

2. Rules that worked previously only with ParquetGroupScan, now can be applied for any class that extends AbstractParquetGroupScan: DrillFilterItemStarReWriterRule, ParquetPruneScanRule, PruneScanRule.

3. Hive populates partition values based on information returned from the Hive metastore. Drill populates partition values based on the path difference between the selection root and the actual file path.

Previously, ColumnExplorer populated partition values using the Drill approach. Since ColumnExplorer now also populates values for parquet files from Hive tables,

the `populateImplicitColumns` method logic was changed to populate partition columns based only on the given partition values.

4. Refactored ParquetPartitionDescriptor to be responsible for populating partition values rather than storing this logic in parquet group scan class.

5. Metadata class was moved to separate metadata package (org.apache.drill.exec.store.parquet.metadata). Factored out several inner classes to improve code readability.

6. Collected all Drill native parquet reader unit tests into one class TestHiveDrillNativeParquetReader, also added new tests to cover new functionality.

7. Reduced excessive logging when parquet files metadata is read

closes #1214

    • -0
    • +248
    ./exec/TestHiveDrillNativeParquetReader.java
    • -19
    • +0
    ./exec/TestHivePartitionPruning.java
    • -13
    • +0
    ./exec/TestHiveProjectPushDown.java
    • -189
    • +26
    ./exec/hive/TestHiveStorage.java
    • -2
    • +4
    ./exec/hive/TestInfoSchemaOnHiveStorage.java
    • -21
    • +61
    ./exec/store/hive/HiveTestDataGenerator.java
  1. … 57 more files in changeset.
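Item 3 of DRILL-6331 contrasts Hive, which takes partition values from the metastore, with Drill, which derives them from the path difference between the selection root and the file. The sketch below illustrates only the path-difference approach, under the assumption that each directory level between the root and the file is one partition value (Drill exposes these as dir0, dir1, and so on). It uses java.nio.file.Path purely to stay self-contained; the class and method names are hypothetical and this is not the ColumnExplorer code.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class PathPartitionSketch {

    /** Returns the directory names that lie between the selection root and the file. */
    public static List<String> partitionValues(String selectionRoot, String filePath) {
        Path relative = Paths.get(selectionRoot).relativize(Paths.get(filePath));
        List<String> values = new ArrayList<>();
        // The last path element is the file name itself, so it is not a partition value.
        for (int i = 0; i < relative.getNameCount() - 1; i++) {
            values.add(relative.getName(i).toString());
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(partitionValues("/data/orders", "/data/orders/2018/Q1/part-0.parquet"));
    }
}
```

For a Hive table the directory layout need not mirror the partition values, which is why the commit switches to populating partition columns only from the values it is given.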
DRILL-6130: Fix NPE during physical plan submission for various storage plugins

1. Fixed ser / de issues for Hive, Kafka, Hbase plugins.

2. Added physical plan submission unit test for all storage plugins in contrib module.

3. Refactoring.

closes #1108

  1. … 26 more files in changeset.
DRILL-6106: Use valueOf method instead of constructor since valueOf has a higher performance by caching frequently requested values.

closes #1099

  1. … 11 more files in changeset.
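The reasoning behind DRILL-6106 can be seen with the JDK wrapper types directly: valueOf may return a cached instance, whereas a constructor call always allocates. A minimal sketch follows; the class and method names are hypothetical.

```java
public class ValueOfSketch {

    /** True when two valueOf calls for the same int return the same cached object. */
    public static boolean sameCachedInteger(int value) {
        // Reference comparison is intentional here: it checks object identity, not numeric equality.
        return Integer.valueOf(value) == Integer.valueOf(value);
    }

    public static void main(String[] args) {
        // Values from -128 to 127 are guaranteed to come from the Integer cache.
        System.out.println(sameCachedInteger(42));
        // Boolean.valueOf always returns one of the two shared constants.
        System.out.println(Boolean.valueOf(true) == Boolean.TRUE);
    }
}
```

Outside the guaranteed cache range valueOf may still allocate, so the benefit is largest for small, frequently requested values. The wrapper constructors have also been deprecated since Java 9.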
DRILL-5730: Mock testing improvements and interface improvements

closes #1045

  1. … 223 more files in changeset.
DRILL-5989: Categorize some tests to speed up smoke tests. Made travis run tests.

closes #1053

  1. … 31 more files in changeset.
DRILL-5978: Updating of Apache and MapR Hive libraries to 2.3.2 and 2.1.2-mapr-1710 versions respectively

* Improvements to allow reading Hive bucketed transactional ORC tables;

* Updating hive properties for tests and resolving dependencies and API conflicts:

- Fix for "hive.metastore.schema.verification", MetaException(message: Version information

not found in metastore) https://cwiki.apache.org/confluence/display/Hive/Hive+Schema+Tool

METASTORE_SCHEMA_VERIFICATION="false" property is added

- Added METASTORE_AUTO_CREATE_ALL="true", properties to tests, because some additional

tables are necessary in Hive metastore

- Disabling Calcite CBO (Hive's CalcitePlanner) for tests, because it is in conflict

with Drill's Calcite version for Drill unit tests. HIVE_CBO_ENABLED="false" property

- jackson and parquet libraries are relocated in hive-exec-shade module

- org.apache.parquet:parquet-column Drill version is added to "hive-exec" to

allow using Parquet empty group on MessageType level (PARQUET-278)

- Removing of commons-codec exclusion from hive core. This dependency is

necessary for hive-exec and hive-metastore.

- Setting Hive internal properties for transactional scan:

HiveConf.HIVE_TRANSACTIONAL_TABLE_SCAN and for schema evolution: HiveConf.HIVE_SCHEMA_EVOLUTION,

IOConstants.SCHEMA_EVOLUTION_COLUMNS, IOConstants.SCHEMA_EVOLUTION_COLUMNS_TYPES

- "io.dropwizard.metrics:metrics-core" with last 4.0.2 version is added to dependencyManagement block in Drill root POM

- Exclusion of "hive-exec" in "hive-hbase-handler" is already in Drill root dependencyManagement POM

- Hive Calcite libraries are excluded (Calcite CBO was disabled)

- "jackson-core" dependency is added to DependencyManagement block in Drill root POM file

- For MapR Hive 2.1 client older "com.fasterxml.jackson.core:jackson-databind" is included

- "log4j:log4j" dependency is excluded from "hive-exec", "hive-metastore", "hive-hbase-handler".

close apache/drill#1111

    • -0
    • +3
    ./exec/store/hive/HiveTestDataGenerator.java
  1. … 10 more files in changeset.
DRILL-5783, DRILL-5841, DRILL-5894: Rationalize test temp directories

This change includes:

DRILL-5783:

- A unit test is created for the priority queue in the TopN operator.

- The code generation classes passed around a completely unused function registry reference in some places so it is removed.

- The priority queue had unused parameters for some of its methods so they are removed.

DRILL-5841:

- Created standardized temp directory classes DirTestWatcher, SubDirTestWatcher, and BaseDirTestWatcher. And updated all unit tests to use them.

DRILL-5894:

- Removed the dfs_test storage plugin for tests and replaced it with the already existing dfs storage plugin.

Misc:

- General code cleanup.

- Removed unnecessary use of String.format in the tests.

This closes #984

    • -2
    • +2
    ./exec/fn/hive/TestInbuiltHiveUDFs.java
    • -4
    • +1
    ./exec/hive/TestInfoSchemaOnHiveStorage.java
    • -9
    • +37
    ./exec/store/hive/HiveTestDataGenerator.java
  1. … 358 more files in changeset.
DRILL-5941: Skip header / footer improvements for Hive storage plugin

Overview:

1. When table has header / footer, process input splits of the same file in one reader (bug fix for DRILL-5941).

2. Apply skip header logic during reader initialization only once to avoid checks during reading the data (DRILL-5106).

3. Apply skip footer logic only when footer is more than 0, otherwise default processing will be done without buffering data in queue (DRILL-5106).

Code changes:

1. AbstractReadersInitializer was introduced to factor out common logic during readers initialization.

It will have two implementations:

a. Default (each input split group gets its own reader);

b. Empty (for empty tables);

2. AbstractRecordsInspector was introduced to improve performance when the table's footer count is less than or equal to 0.

It will have two implementations:

a. Default (records will be processed one by one without buffering);

b. SkipFooter (queue will be used to buffer N records that should be skipped in the end of file processing).

3. When text table has header / footer each table file should be read as one unit. When file is being read as several input splits, they should be grouped.

For this purpose LogicalInputSplit class was introduced which replaced InputSplitWrapper class. New class stores list of grouped input splits and returns information about splits on group level.

Please note, during planning input splits are grouped only when data is being read from a text table that has a header / footer; otherwise each input split is treated separately.

4. Allow HiveAbstractReader to have multiple input splits instead of one.

This closes #1030

    • -12
    • +22
    ./exec/hive/TestHiveStorage.java
    • -6
    • +8
    ./exec/hive/TestInfoSchemaOnHiveStorage.java
    • -27
    • +47
    ./exec/store/hive/HiveTestDataGenerator.java
  1. … 18 more files in changeset.
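The DRILL-5941 notes describe two record-inspection modes: records pass straight through when there is no footer, and otherwise a queue buffers N records so that the last N are never emitted. The sketch below illustrates that buffering idea only. It is not the AbstractRecordsInspector code; the class name, the method name, and the use of an in-memory list as output are assumptions.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

public class SkipFooterSketch {

    /** Returns all records except the last footerCount ones, without knowing the total in advance. */
    public static <T> List<T> readSkippingFooter(Iterable<T> records, int footerCount) {
        List<T> out = new ArrayList<>();
        if (footerCount <= 0) {
            // Default mode: no footer, so records are passed through one by one without buffering.
            for (T record : records) {
                out.add(record);
            }
            return out;
        }
        // Skip-footer mode: hold back footerCount records at all times.
        Queue<T> buffer = new ArrayDeque<>(footerCount + 1);
        for (T record : records) {
            buffer.add(record);
            if (buffer.size() > footerCount) {
                // The oldest buffered record can no longer be part of the footer.
                out.add(buffer.remove());
            }
        }
        // Whatever is still buffered at end of input is the footer and is dropped.
        return out;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("r1", "r2", "r3", "footer1", "footer2");
        System.out.println(readSkippingFooter(rows, 2));
    }
}
```

Because the footer can only be recognized at end of file, a file read as several independent splits would drop rows in the wrong place, which is why the commit groups the splits of one file into a single reader.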
DRILL-3993: Fix unit test failures connected with support Calcite 1.13

- Use root schema as default for describe table statement.

Fix TestOpenTSDBPlugin.testDescribe() and TestInfoSchemaOnHiveStorage.varCharMaxLengthAndDecimalPrecisionInInfoSchema() unit tests.

- Modify expected results for tests:

TestPreparedStatementProvider.invalidQueryValidationError();

TestProjectPushDown.testTPCH1();

TestProjectPushDown.testTPCH3();

TestStorageBasedHiveAuthorization.selectUser1_db_u0_only();

TestStorageBasedHiveAuthorization.selectUser0_db_u1g1_only()

- Fix TestCTAS.whenTableQueryColumnHasStarAndTableFiledListIsSpecified(), TestViewSupport.createViewWhenViewQueryColumnHasStarAndViewFiledListIsSpecified(), TestInbuiltHiveUDFs.testIf(), testDisableUtf8SupportInQueryString unit tests.

- Fix UnsupportedOperationException and NPE for jdbc tests.

- Fix AssertionError: Conversion to relational algebra failed to preserve datatypes

*DrillCompoundIdentifier:

According to the changes, made in [CALCITE-546], star Identifier is replaced by empty string during parsing the query. Since Drill uses its own DrillCompoundIdentifier, it should also replace star by empty string before creating SqlIdentifier instance to avoid further errors connected with star column. see SqlIdentifier.isStar() method.

*SqlConverter:

In [CALCITE-1417] added simplification of expressions which should be projected every time when a new project rel node is created using RelBuilder. It causes assertion errors connected with types nullability. This hook was set to false to avoid project expressions simplification. See usage of this hook and RelBuilder.project() method.

In Drill the type nullability of the function depends only on the nullability of its arguments. In some cases, a function may return null value even if it had non-nullable arguments. When Calcite simplifies expressions, it checks that the type of the result is the same as the type of the expression. Otherwise, makeCast() method is called. But when a function returns null literal, this cast does nothing, even when the function has a non-nullable type. So to avoid this issue, method makeCast() was overridden.

*DrillAvgVarianceConvertlet:

Problem with sum0 and specific changes in old Calcite (it is CALCITE-777). (see HistogramShuttle.visitCall method) Changes were made to avoid changes in Calcite.

*SqlConverter, DescribeTableHandler, ShowTablesHandler:

New Calcite tries to combine both default and specified workspaces during the query validation. In some cases, for example, when describe table statement is used, Calcite tries to find INFORMATION_SCHEMA in the schema used as default. When it does not find the schema, it tries to find a table with such name. For some storage plugins, such as opentsdb and hbase, when a table was not found, the error is thrown, and the query fails. To avoid this issue, default schema was changed to root schema for validation stage for describe table and show tables queries.

    • -1
    • +1
    ./exec/fn/hive/TestInbuiltHiveUDFs.java
  1. … 16 more files in changeset.
DRILL-5832: Change OperatorFixture to use system option manager

- Rename FixtureBuilder to ClusterFixtureBuilder

- Provide alternative way to reset system/session options

- Fix for DRILL-5833: random failure in TestParquetWriter

- Provide strict, but clear, errors for missing options

closes #970

  1. … 51 more files in changeset.
DRILL-5772: Enable UTF-8 support in query string by default

1. Bump up Drill Calcite version to include CALCITE-2014 changes.

2. Add saffron.properties file to the Drill conf folder.

3. Add appropriate unit tests.

closes #936

  1. … 6 more files in changeset.
DRILL-5752 this change includes:

1. Increased test parallelism and fixed associated bugs

2. Added test categories and categorized tests appropriately

- Don't exclude anything by default

- Increase test timeout

- Fixed flakey test

closes #940

    • -0
    • +4
    ./exec/fn/hive/TestInbuiltHiveUDFs.java
    • -0
    • +4
    ./exec/fn/hive/TestSampleHiveUDFs.java
    • -0
    • +4
    ./exec/hive/TestInfoSchemaOnHiveStorage.java
    • -1
    • +1
    ./exec/test/Drill2130StorageHiveCoreHamcrestConfigurationTest.java
  1. … 254 more files in changeset.
DRILL-5002: Using hive's date functions on top of date column gives wrong results for local time-zone

closes #937

    • -1
    • +42
    ./exec/fn/hive/TestInbuiltHiveUDFs.java
  1. … 2 more files in changeset.
DRILL-5723: Added System Internal Options That can be Modified at Runtime

Changes include:

1. Addition of internal options.

2. Refactoring of OptionManagers and OptionValidators.

3. Fixed ambiguity in the meaning of an option type, and changed its name to accessibleScopes.

4. Updated javadocs in the Option System classes.

5. Added RestClientFixture for testing the Rest API.

6. Fixed flakey test in TestExceptionInjection caused by race condition.

7. Fixed various tests which started zookeeper but failed to shut it down at the end of tests.

8. Added port hunting to the Drill Webserver for testing

9. Fixed various flaky tests

10. Fix compile issue

closes #923

    • -1
    • +1
    ./exec/fn/hive/TestInbuiltHiveUDFs.java
  1. … 84 more files in changeset.
DRILL-3250: Drill fails to compare multi-byte characters from hive table

- A small refactoring of original fix of this issue (DRILL-4039);

- Added test for the fix.

  1. … 3 more files in changeset.
DRILL-5459: Extend physical operator test framework to test mini plans consisting of multiple operators.

This closes #823

    • -2
    • +2
    ./exec/store/hive/HiveTestDataGenerator.java
  1. … 16 more files in changeset.