DRILL-7620: Fix plugin mutability issues

A recent commit made the plugin registry more strict about the rule that, once a plugin is registered, it must be immutable. A flaw in enforcing that rule in the UI put the registry in an inconsistent state.

Also

* Registry-specific errors

* Push more operations from UI layer into registry

* Clean up semantics of "resolve" for plugins

* Add more unit tests

* Better handling of "bad" plugins

* Force plugin names to lower case

* Fix comparison bugs in some format plugins

  1. … 101 more files in changeset.
DRILL-7196: Queries are still runnable on disabled plugins

- Storage client is no longer created for disabled plugins

- GET "/storage/{name}.json" endpoint now works with the plugin configuration directly, without client instantiation. This has improved UI responsiveness.

- HBase and Mongo base test classes refactored to honor the plugin's enabled attribute

- Fixed path constructor for Mongo test datasets; it is now cross-platform

- Fixed the format of test JSON files that use plugin definitions

- Code cleanup

  1. … 106 more files in changeset.
DRILL-7115: Improve Hive schema show tables performance

1. To make SHOW TABLES for a Hive schema much faster, the additional Drill feature of showing only accessible tables when Storage-Based authorization is enabled was sacrificed. The behaviour now matches Hive/Beeline: all tables are shown regardless of accessibility. For details about the previous SHOW TABLES results, see the description of DRILL-540.

2. Implemented a faster getTableNamesAndTypes() method in HiveDatabaseSchema and removed bulk-related code.

3. Deprecated bulk-related options and removed bulk code from AbstractSchema and DrillHiveMetastoreClient.

4. For 8000 Hive tables the query returned in 1.8 seconds; for a combination of 4000 tables and 8000 views it returned in 2.3 seconds. Note that after the first query the table names are cached, and subsequent queries complete in less than 1 second.

5. Refactored WorkspaceSchemaFactory's getTableNamesAndTypes() method to reuse the existing getViews() method.

6. Refactored DrillHiveMetastoreClient. Classes were unnested and enclosed within the client package with restricted visibility. The cache value type was also updated to avoid unnecessary back-and-forth List to Set conversions. Client creation methods were moved to a separate class, so the new package exposes only the factory and client classes.

closes #1706

  1. … 19 more files in changeset.
DRILL-6944: UnsupportedOperationException thrown for view over MapR-DB binary table

1. Added persistence of MAP key and value types in Drill views (affects .view.drill file) for avoiding cast problems in future.

2. Preserved backward compatibility of older view files by treating untyped maps as ANY.

closes #1602

  1. … 5 more files in changeset.
DRILL-6850: Force setting DRILL_LOGICAL Convention for DrillRelFactories and DrillFilterRel

- Fix workspace case insensitivity for JDBC storage plugin

  1. … 13 more files in changeset.
DRILL-540: Allow querying hive views in Drill

1. Added DrillHiveViewTable which allows construction of DrillViewTable based on Hive metadata

2. Added initialization of DrillHiveViewTable in HiveSchemaFactory

3. Extracted conversion of Hive data types from DrillHiveTable to HiveToRelDataTypeConverter

4. Removed throwing of UnsupportedOperationException from HiveStoragePlugin

5. Added TestHiveViewsSupport and authorization tests

6. Added closeSilently() method to AutoCloseables

closes #1559

  1. … 13 more files in changeset.
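The closeSilently() helper mentioned in DRILL-540 above could look roughly like the sketch below. This is a minimal illustration, not Drill's actual AutoCloseables code: the class name AutoCloseablesSketch and the returned failure count are assumptions made for the example.

```java
/** Hypothetical stand-in for a closeSilently() helper; not Drill's actual AutoCloseables class. */
final class AutoCloseablesSketch {

  private AutoCloseablesSketch() {}

  /**
   * Closes every non-null resource in order and swallows any exception,
   * so a failure in one close() cannot prevent the remaining resources from closing.
   *
   * @return the number of resources whose close() threw (an assumption, handy for logging)
   */
  static int closeSilently(AutoCloseable... closeables) {
    if (closeables == null) {
      return 0;
    }
    int failures = 0;
    for (AutoCloseable closeable : closeables) {
      if (closeable == null) {
        continue; // nothing to close
      }
      try {
        closeable.close();
      } catch (Exception e) {
        failures++; // a real implementation would probably log the exception here
      }
    }
    return failures;
  }
}
```

Swallowing exceptions like this is usually only appropriate on cleanup paths where a close failure must not mask the original error.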
DRILL-6422: Replace guava imports with shaded ones

  1. … 982 more files in changeset.
DRILL-6492: Ensure schema / workspace case insensitivity in Drill

1. StoragePluginsRegistryImpl was updated:

a. for backward compatibility, to convert all existing storage plugin names to lower case at init; in case of duplicates, a warning is logged and the duplicate is skipped.

b. to wrap the persistent plugins registry in a case-insensitive store wrapper (CaseInsensitivePersistentStore) so that all given keys are converted to lower case during insert, update, delete and search operations.

c. to load system storage plugins dynamically by the @SystemStorage annotation.

2. StoragePlugins class was updated to store storage plugin configs by name in a case-insensitive map.

3. SchemaUtilities.searchSchemaTree method was updated to convert all schema names to lower case to ensure that they are matched case insensitively (all schemas are stored in Drill in lower case).

4. FileSystemConfig was updated to store workspaces by name in a case-insensitive hash map.

5. All plugin schema factories now extend AbstractSchemaFactory to ensure that the given schema name is converted to lower case.

6. New method areTableNamesAreCaseInsensitive was added to AbstractSchema to indicate if schema tables names are case insensitive. By default, false. Schema implementation is responsible for table names case insensitive search in case it supports one. Currently, information_schema, sys and hive do so.

7. System storage plugins (information_schema, sys) were refactored to ensure their schema, table names are case insensitive, also the annotation @SystemPlugin and additional constructor were added to allow dynamically load system plugins at storage plugin registry during init phase.

8. MetadataProvider was updated to convert all schema filter conditions to lower case to ensure schemas are matched case insensitively.

9. ShowSchemasHandler, ShowTablesHandler, DescribeTableHandler were updated to ensure schema / tables names (this depends if schema supports case insensitive table names) would be found case insensitively.

git closes #1439

  1. … 54 more files in changeset.
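The case-insensitive store wrapper described in DRILL-6492 above can be illustrated with a small sketch. It assumes a simplified in-memory key-value store rather than Drill's real PersistentStore interface; the class CaseInsensitiveStoreSketch and its method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical, simplified store; Drill's real PersistentStore API is not reproduced here. */
final class CaseInsensitiveStoreSketch<V> {

  private final Map<String, V> delegate = new HashMap<>();

  /** Every key passes through here, so lookups never depend on the caller's casing. */
  private static String normalize(String key) {
    return key.toLowerCase(Locale.ROOT);
  }

  V get(String key) {
    return delegate.get(normalize(key));
  }

  boolean contains(String key) {
    return delegate.containsKey(normalize(key));
  }

  void put(String key, V value) {
    delegate.put(normalize(key), value);
  }

  /** Returns false and keeps the existing value when a differently-cased duplicate exists. */
  boolean putIfAbsent(String key, V value) {
    return delegate.putIfAbsent(normalize(key), value) == null;
  }

  V delete(String key) {
    return delegate.remove(normalize(key));
  }
}
```

Locale.ROOT is used so that lower-casing does not change with the JVM's default locale. The putIfAbsent() method mirrors the "skip the duplicate" behaviour described for init.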
DRILL-6494: Drill Plugins Handler

- The Storage Plugins Handler service is used at the Drill start-up stage; it updates storage plugin configs from the storage-plugins-override.conf file. If plugin configs are present in the persistence store, they are updated; otherwise the bootstrap plugins are updated and the resulting configs are loaded into the persistence store. If the enabled status is absent from the storage-plugins-override.conf file, the plugin config's last enabled status persists.

- The 'drill.exec.storage.action_on_plugins_override_file' boot option is added. It specifies the action to perform on the storage-plugins-override.conf file after the storage plugin configs have been updated successfully. Possible values are: "none" (default), "rename" and "remove".

- The "NULL" issue with updating the Hive plugin config via REST is solved. However, clients are still instantiated for disabled plugins (DRILL-6412).

- The "org.honton.chas.hocon:jackson-dataformat-hocon" library is added for proper deserialization of the HOCON conf file.

- Additional refactoring: "com.typesafe:config" and "org.apache.commons:commons-lang3" are placed into the DependencyManagement block with proper versions; correct properties for metrics are specified in "drill-override-example.conf".

closes #1345

  1. … 34 more files in changeset.
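The three values of the 'drill.exec.storage.action_on_plugins_override_file' option described in DRILL-6494 above could be modelled as in the following sketch. The enum OverrideFileActionSketch, the parse() helper and the ".bak" rename target are all assumptions for illustration; the commit message does not say what name Drill renames the file to.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Locale;

/** Hypothetical model of the "none" / "rename" / "remove" actions; not Drill's own type. */
enum OverrideFileActionSketch {
  NONE, RENAME, REMOVE;

  /** Maps the option value to an action; a missing or blank value falls back to the default "none". */
  static OverrideFileActionSketch parse(String value) {
    if (value == null || value.trim().isEmpty()) {
      return NONE;
    }
    return valueOf(value.trim().toUpperCase(Locale.ROOT));
  }

  /** Applies the action to the override file once the plugin configs were updated successfully. */
  void apply(Path overrideFile) {
    try {
      switch (this) {
        case RENAME: {
          // Assumption: the ".bak" suffix is made up for this sketch.
          Path target = overrideFile.resolveSibling(overrideFile.getFileName() + ".bak");
          Files.move(overrideFile, target, StandardCopyOption.REPLACE_EXISTING);
          break;
        }
        case REMOVE:
          Files.deleteIfExists(overrideFile);
          break;
        default:
          break; // NONE leaves the file untouched
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

Renaming or removing the file after a successful update presumably keeps the same overrides from being re-applied on the next start-up, while "none" leaves them in place to be applied each time.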
DRILL-6386: Remove unused imports and star imports.

  1. … 231 more files in changeset.
DRILL-6320: Fixed license headers.

closes #1207

  1. … 2066 more files in changeset.
DRILL-3993: Changes to support Calcite 1.15.

Fix AssertionError: type mismatch for tests with aggregate functions.

Fix VARIANCE agg function

Remove using deprecated Subtype enum

Fix 'Failure while loading table a in database hbase' error

Fix 'Field ordinal 1 is invalid for type '(DrillRecordRow[*])'' unit test failures

  1. … 17 more files in changeset.
DRILL-3250: Drill fails to compare multi-byte characters from hive table

- A small refactoring of original fix of this issue (DRILL-4039)

- Added test for the fix.

  1. … 3 more files in changeset.
DRILL-5496: Fix for failed Hive connection

If the Hive server restarts, Drill either hangs or continually reports errors when retrieving schemas. The problem is that the Hive plugin tries to handle connection failures, but does not do so correctly for the secure connection case. The problem is complex, see DRILL-5496 for details.

This is a workaround: we discard the entire Hive schema cache when we encounter an unhandled connection exception, then we rebuild a new one.

This is not a proper fix; for that we'd have to restructure the code. This will, however, solve the immediate problem until we do the needed restructuring.

  1. … 5 more files in changeset.
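The workaround in DRILL-5496 above (discard the whole schema cache on an unhandled connection exception, then rebuild it) can be sketched as follows. Everything here is hypothetical: SchemaCacheSketch, ConnectionFailure and the single retry are assumptions chosen to show the idea, not the Hive plugin's real code.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical cache of table names per database, rebuilt after a connection failure. */
final class SchemaCacheSketch {

  /** Hypothetical marker for an unhandled connection failure. */
  static final class ConnectionFailure extends RuntimeException {
    ConnectionFailure(String message) {
      super(message);
    }
  }

  private final Function<String, List<String>> loader; // e.g. a metastore call
  private volatile Map<String, List<String>> cache = new ConcurrentHashMap<>();

  SchemaCacheSketch(Function<String, List<String>> loader) {
    this.loader = loader;
  }

  List<String> tables(String database) {
    try {
      return lookup(database);
    } catch (ConnectionFailure e) {
      // Workaround: throw away the entire cache and retry once against a fresh one.
      cache = new ConcurrentHashMap<>();
      return lookup(database);
    }
  }

  private List<String> lookup(String database) {
    Map<String, List<String>> current = cache;
    List<String> tables = current.get(database);
    if (tables == null) {
      tables = loader.apply(database); // may throw ConnectionFailure
      current.put(database, tables);
    }
    return tables;
  }
}
```

If the second attempt also fails, the exception propagates to the caller; the sketch makes no attempt at the restructuring that the commit message says a proper fix would need.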
DRILL-4039: Query fails when non-ascii characters are used in string literals

closes #825

  1. … 1 more file in changeset.
DRILL-5032: Drill query on hive parquet table failed with OutOfMemoryError: Java heap space

close apache/drill#654

  1. … 22 more files in changeset.
DRILL-4826: Query against INFORMATION_SCHEMA.TABLES degrades as the number of views increases

This closes #592

  1. … 8 more files in changeset.
DRILL-4768: Fix leaking hive meta store connection in Drill's hive metastore client call.

- do not call reconnect if the connection is still alive and the error is caused by either UnknownTableException or access error.

- call close() explicitly before reconnect() and check if client.close() will hit exception.

- make DrillHiveMetaStoreClient closable.

close apache/drill#543

  1. … 1 more file in changeset.
DRILL-4577: Construct a specific path for querying all the tables from a hive database

  1. … 9 more files in changeset.
DRILL-3745: Hive CHAR not supported

  1. … 11 more files in changeset.
DRILL-4256: Create HiveConf per HiveStoragePlugin and reuse it wherever needed.

Creating new instances of HiveConf is very costly; we should avoid creating new ones as much as possible.

Also get rid of hiveConfigOverride and use HiveConf in HiveStoragePlugin wherever we need the HiveConf.

  1. … 13 more files in changeset.
DRILL-4126: Enable caching for HiveMetaStore access when impersonation is enabled. Change table cache in metastore client.

1) HiveSchemaFactory maintains one metastoreClientWithAuth per user.

2) Put the hive metastore object caches in DrillHiveMetaStoreClient.

3) Use LoadingCache to store the set of clients for impersonation.

4) Use flat level cache for tables in DrillHiveMetastoreClient.

  1. … 1 more file in changeset.
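The per-user client cache from DRILL-4126 above can be approximated with the sketch below. The commit uses a LoadingCache; to stay within the standard library, this example swaps in ConcurrentHashMap.computeIfAbsent, which gives the same "create on first request, reuse afterwards" behaviour but none of LoadingCache's eviction or expiry features. The class PerUserClientCacheSketch is hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical per-user client cache; a plain-JDK stand-in for a LoadingCache. */
final class PerUserClientCacheSketch<C> {

  private final Map<String, C> clients = new ConcurrentHashMap<>();
  private final Function<String, C> factory; // builds a client that impersonates the given user

  PerUserClientCacheSketch(Function<String, C> factory) {
    this.factory = factory;
  }

  /** Returns the cached client for the user, creating it on first use. */
  C clientFor(String userName) {
    return clients.computeIfAbsent(userName, factory);
  }

  int size() {
    return clients.size();
  }
}
```

Keeping one client per user means each impersonated user's metadata access can go through a client created for that user, while repeated queries from the same user reuse the existing client.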
DRILL-4127: Reduce Hive metastore client API call in HiveSchema.

1) Use lazy loading of tableNames in HiveSchema, instead of pre-loading all table names under each HiveSchema.

2) Do not call get_all_databases for subSchema to check existence if the name comes from getSubSchemaNames() directly.

review comments.
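The lazy loading of table names described in DRILL-4127 above might be sketched like this. The class LazyTableNamesSketch and its Supplier-based loader are assumptions for illustration; HiveSchema's real fields and metastore calls are not shown.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.function.Supplier;

/** Hypothetical schema object that defers the metastore call until table names are needed. */
final class LazyTableNamesSketch {

  private final Supplier<Set<String>> loader; // stands in for a metastore client call
  private Set<String> tableNames;             // stays null until first requested

  LazyTableNamesSketch(Supplier<Set<String>> loader) {
    this.loader = loader;
  }

  /** Loads the names on first access only; later calls return the stored set. */
  synchronized Set<String> getTableNames() {
    if (tableNames == null) {
      tableNames = Collections.unmodifiableSet(new LinkedHashSet<>(loader.get()));
    }
    return tableNames;
  }
}
```

Constructing the schema object costs no metastore call, so schemas that are never queried for their tables never trigger one.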

DRILL-3209: Add support for reading Hive parquet tables using Drill native parquet reader

  1. … 15 more files in changeset.
DRILL-3413: When SASL is enabled use DIGEST mechanism in creating HiveMetaStoreClient for proxy users.

  1. … 1 more file in changeset.
DRILL-3203: Add support for impersonation in Hive storage plugin

  1. … 19 more files in changeset.
DRILL-3263: Read Smallint and Tinyint columns in Hive tables as Integer.

Smallint and Tinyint are not fully implemented, this will be addressed when DRILL-2470 is fixed. Until these types are ready for full use throughout Drill, we will be reading smallint and tinyint data as integers, as we have much more thorough support and testing for the integer type.

Disabled unit tests for Hive functions that take tinyint and smallint as input or produce them as output.

  1. … 5 more files in changeset.
DRILL-2514: Add support for impersonation in FileSystem storage plugin.

  1. … 70 more files in changeset.
DRILL-2413: FileSystemPlugin refactoring: avoid sharing DrillFileSystem across schemas

  1. … 33 more files in changeset.
DRILL-2173: Partition queries to drive dynamic pruning

Adds a new interface on the QueryContext, as well as on individual schemas, for exploring partitions of tables.

Adds an injectable type for the partition explorer for use in UDFs. This is hooked into both expression materialization and interpreted evaluation. The FragmentContext throws an exception to tell users to turn on constant folding if a UDF that uses the PartitionExplorer makes it past planning.

2173 update -Address Chris' review comments.

Change the PartitionExplorer to return an Iterable<String> instead of String[]

Add interface level description to PartitionExplorer and StoragePluginPartitionExplorer.

New inner class in FileSystemPlugin to fulfill the new Iterable interface for partitions.

Formatting/cleanup fixes

Clean up error reporting code in MaxDir UDF. Remove method to get a string from a DrillBuf, as it was already defined in StringFunctionHelpers. Add new utility method to specifically convert a VarCharHolder to a string to remove some boilerplate.

Fixed an errant copy paste in a comment and removed unused imports.

Fix docs in FileSystemPlugin, belongs with the 2173 changes.

Fix references in Javadoc to properly use @link instead of @see.

2173 fixes, correctly return an empty list of sub-partitions if the path requested in the partition explorer interface is a file. Fix a few docs.

More 2173, finishing Chris' comments

2173 update - Add validation for PartitionExplorer injectable in UdfUtilities.

small change to fix refactored unit tests.

cleanup for 2173

Fix maxdir UDF so it can compile in runtime generated code as well as the interpreted expression system (needed to fully qualify classes and interfaces). It still fails to execute, as we prevent requesting a schema from a non-root fragment. We do not expect these types of functions to ever be used without constant folding so this should not be an issue.

Update error message in the case where the partition explorer is being used outside of planning.

Adding FreeMarker-generated maxdir, imaxdir, mindir and imindir

remove import that violates build checks, fix typo in new test class name

Separate out SubDirectoryList from FileSystemSchemaFactory.

Fix unit test to correctly test all four functions.

Update partition explorer to take List instead of Collection. As the lists are used in parallel it should be explicit that these are expected to be ordered (which Collections do not guarantee).

Drop the extra file generated due to the header in the FreeMarker template, fix a typo and remove an unused import.

  1. … 19 more files in changeset.
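The partition explorer from DRILL-2173 above returns an Iterable<String> of sub-partitions and yields an empty list when the requested path is a file. A much simplified sketch of that contract, together with a maxdir-style lookup, is shown below. The single-argument getSubPartitions(String) signature, the in-memory implementation and the plain String.compareTo ordering are assumptions for the example and do not reproduce Drill's actual interface.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, simplified explorer; the real interface takes more parameters. */
interface PartitionExplorerSketch {
  /** Returns the names of the sub-partitions under the path, or nothing if it is a file. */
  Iterable<String> getSubPartitions(String path);
}

/** In-memory implementation used only to exercise the sketch. */
final class InMemoryPartitionExplorer implements PartitionExplorerSketch {

  private final Map<String, List<String>> directories = new HashMap<>();

  InMemoryPartitionExplorer addDirectory(String path, List<String> children) {
    directories.put(path, children);
    return this;
  }

  @Override
  public Iterable<String> getSubPartitions(String path) {
    List<String> children = directories.get(path);
    // Unknown paths are treated like files: no sub-partitions.
    return children == null ? Collections.<String>emptyList() : children;
  }
}

final class MaxDirSketch {

  private MaxDirSketch() {}

  /** Returns the greatest sub-partition name by String.compareTo, or null if there is none. */
  static String maxDir(PartitionExplorerSketch explorer, String path) {
    String max = null;
    for (String child : explorer.getSubPartitions(path)) {
      if (max == null || child.compareTo(max) > 0) {
        max = child;
      }
    }
    return max;
  }
}
```

Because the result depends only on directory names, a call like this can be evaluated once during planning, which is why the commit message expects such functions to be used together with constant folding.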