drill-module.conf

DRILL-7437: Storage Plugin for Generic HTTP REST API

  1. … 35 more files in changeset.
DRILL-7675: Work around for partitions sender memory use

Adds an ad-hoc system/session option to limit partition sender memory use. See DRILL-7686 for the underlying issue.

Also includes code cleanup and diagnostic tools.

closes #2047

  1. … 16 more files in changeset.
DRILL-7650: Add option to enable Jetty's dump for troubleshooting

1. Added option drill.exec.http.jetty.server.dumpAfterStart

2. Removed redundant setProtocol() call

  1. … 4 more files in changeset.
DRILL-7637: Add an option to retrieve MapR SSL truststore/keystore credentials using MapR Web Security Manager

  1. … 9 more files in changeset.
DRILL-7607: support dynamic credit based flow control

closes #2000

  1. … 32 more files in changeset.
DRILL-7592: Add missing licenses and update plugins exclusion list and fix licenses

closes #1989

  1. … 85 more files in changeset.
DRILL-7582: Moved Drillbits REST API communication to the back end layer

closes #1999

  1. … 6 more files in changeset.
DRILL-7565: ANALYZE TABLE ... REFRESH METADATA does not work for empty Parquet files

- Fixed ConvertMetadataAggregateToDirectScanRule rule to distinguish array columns correctly and proceed using other parquet metadata if such columns are found.

- Added a new implicit column that signals whether an empty result was obtained while collecting metadata, which helps distinguish real data results from metadata results.

- Updated scan to return row with metadata if the above implicit column is present.

- Added unit tests for checking the correctness of both optional and required columns from empty files.

closes #1985

  1. … 13 more files in changeset.
DRILL-7590: Refactor plugin registry

Major cleanup of the plugin registry to split it into components in preparation for a proper plugin API. Better coordinates the named and ephemeral plugin caches. Cleans up the registry API. Sharpens rules for modifying plugin configs.

closes #1988

  1. … 163 more files in changeset.
DRILL-6832: Removes the old "unmanaged" external sort

When the "managed" external sort was implemented a couple of years back, we retained the original "unmanaged" version out of an abundance of caution. The new version is now battle tested and it is time to retire the original one.

closes #1929

  1. … 64 more files in changeset.
DRILL-7388: Kafka improvements

1. Upgraded Kafka libraries to 2.3.1 (DRILL-6739).

2. Added new options to support the same features as native JSON reader:

a. store.kafka.reader.skip_invalid_records, default: false (DRILL-6723);

b. store.kafka.reader.allow_nan_inf, default: true;

c. store.kafka.reader.allow_escape_any_char, default: false.

3. Fixed issue when Kafka topic contains only one message (DRILL-7388).

4. Replaced the Gson parser with Jackson to parse JSON in the same manner as Drill's native JSON reader.

5. Performance improvements: Kafka consumers are now closed asynchronously, fixed a resource leak (DRILL-7290), and moved unnecessary info logging to debug level.

6. Updated bootstrap-storage-plugins.json to reflect actual Kafka connection properties.

7. Added unit tests.

8. Refactoring and code clean up.

closes #1901
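
The three reader options listed above can be changed per session. The following is a minimal sketch, not code from the changeset: it only renders `ALTER SESSION SET` statements for the option names and defaults given in the commit message. The class and method names are hypothetical, and the statements would still need to be submitted through a Drill client.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; only the option names and defaults come from the commit message.
final class KafkaReaderOptionsSketch {

    /** Option names and default values as listed in the commit message. */
    static Map<String, Boolean> defaults() {
        Map<String, Boolean> opts = new LinkedHashMap<>();
        opts.put("store.kafka.reader.skip_invalid_records", false);
        opts.put("store.kafka.reader.allow_nan_inf", true);
        opts.put("store.kafka.reader.allow_escape_any_char", false);
        return opts;
    }

    /** Renders one ALTER SESSION statement per option; backticks quote the dotted name. */
    static List<String> toAlterSession(Map<String, Boolean> opts) {
        List<String> statements = new ArrayList<>();
        for (Map.Entry<String, Boolean> e : opts.entrySet()) {
            statements.add("ALTER SESSION SET `" + e.getKey() + "` = " + e.getValue());
        }
        return statements;
    }

    public static void main(String[] args) {
        Map<String, Boolean> opts = defaults();
        // Example override: skip records that fail to parse instead of failing the query.
        opts.put("store.kafka.reader.skip_invalid_records", true);
        toAlterSession(opts).forEach(System.out::println);
    }
}
```

Keeping the defaults in one insertion-ordered map makes it easy to see which options a session has actually overridden.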

  1. … 33 more files in changeset.
DRILL-7402: Suppress batch dumps for expected failures in tests

Drill provides a way to dump the last few batches when an error occurs. However, in tests, we often deliberately cause something to fail. In this case, the batch dump is unnecessary.

This enhancement adds a config property, disabled in tests, that controls the dump activity. The option is enabled in the one test that needs it enabled.

closes #1872

  1. … 4 more files in changeset.
DRILL-6096: Provide mechanism to configure text writer configuration

1. The format plugin configuration lets you specify line and field delimiters, quote characters, and escape characters.

2. System / session options let you specify whether the writer should add headers and force quotes.

closes #1873

  1. … 19 more files in changeset.
DRILL-7222: Visualize estimated and actual row counts for a query

With statistics in place, it is useful to show the estimated row count alongside the actual row count in the query profile's operator overview. A toggle button enables this, with the estimated rows hidden by default.

The estimates can be extracted from the Physical Plan section of the profile.

Added a toggle-ready table-column header

closes #1779

  1. … 5 more files in changeset.
DRILL-7356: Introduce session options for the Drill Metastore

closes #1846

  1. … 2 more files in changeset.
DRILL-7338: REST API calls to Drill fail due to insufficient heap memory

This PR makes the 85% threshold customizable; a value of 0 disables it.

closes #1837
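
The threshold behavior described above can be sketched as a small check. This is an illustration only, not Drill's implementation: the class and method names are hypothetical, and it assumes the threshold is expressed as a fraction of the maximum heap (0.85 for 85%) and that usage strictly above the threshold triggers rejection.

```java
// Hypothetical sketch of a heap-usage guard; not taken from the Drill source.
final class HeapGuardSketch {

    /**
     * Returns true when a new request should be rejected.
     * The threshold is a fraction in (0, 1]; a value of 0 disables the check,
     * matching the behavior described in the commit message.
     */
    static boolean shouldReject(long usedBytes, long maxBytes, double threshold) {
        if (threshold <= 0.0 || maxBytes <= 0) {
            return false; // check disabled, or maximum heap size unknown
        }
        return (double) usedBytes / maxBytes > threshold;
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory();
        System.out.println("reject at 85%: " + shouldReject(used, rt.maxMemory(), 0.85));
        System.out.println("reject when disabled: " + shouldReject(used, rt.maxMemory(), 0.0));
    }
}
```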

  1. … 2 more files in changeset.
DRILL-7313: Use Hive schema for MaprDB native reader when field was empty

- Added all_text_mode option for hive maprDB Json

- Improved logic to convert Hive's schema into Drill's one

- Added unit tests for schema conversion

  1. … 27 more files in changeset.
DRILL-7273: Introduce operators for handling metadata

closes #1886

  1. … 156 more files in changeset.
DRILL-7048: Implement JDBC Statement.setMaxRows() with System Option

This introduces support for JDBC's Statement.setMaxRows(int) API, which can help Drill execute a query much faster if it knows that not ALL the records in the resultset will be consumed upfront.

This commit introduces the core changes to support the feature within Drill's execution engine.

Protobuf Changes

1. RunQuery: Added "autolimit_rowcount"

2. QueryProfile: Added "autoLimit"

3. Regenerated Java and C++ client files

REST API support

1. Support for REST server to interpret a submitted query and also for rendering this information for an executed query

2. Updates to the Freemarker templates (for WebUI)

3. Safety check within Javascript (for WebUI)

JDBC API support

1. Introduces backend execution of 'ALTER SESSION' to apply the auto-limiting of resultset size

2. Added Unit Tests for PreparedStatement and Statement objects

3. Added getter setter methods to be skipped in testing for org.apache.drill.jdbc.test.Drill2489CallsAfterCloseThrowExceptionsTest.testclosedPreparedStmtOfOpenConnMethodsThrowRight()

Updates based on review comments

Additional Updates

Test Cleanup

1. Revert Drill2489 hack

2. Formatting in *StatementTest

3. Removal of redundant `statement.close()`

4. Manage new Exception thrown when setting invalid maxRow values

Final updates

1. Test changes

2. Trim trailing spaces in auto-limit value (Javascript)

3. Before & After annotations to synchronize changes to system values for MaxRows(auto-limit)

Reorganized tests due to synchronized locking

Removed conflicting JsonCreator in QueryWrapper

Additional test cleanup

closes #1714
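
From the client side, the feature is reached through the standard `java.sql.Statement.setMaxRows(int)` call. The sketch below assumes a JDBC driver is on the classpath and that the connection URL and query are supplied by the caller; the class and method names are hypothetical, and nothing here reflects how Drill applies the limit internally.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical client-side sketch; only Statement.setMaxRows comes from the commit message.
final class MaxRowsSketch {

    /** Runs the query with a row cap and returns how many rows were actually read. */
    static int countCappedRows(Connection conn, String sql, int maxRows) throws SQLException {
        try (Statement stmt = conn.createStatement()) {
            stmt.setMaxRows(maxRows); // standard JDBC API; 0 means "no limit"
            try (ResultSet rs = stmt.executeQuery(sql)) {
                int rows = 0;
                while (rs.next()) {
                    rows++;
                }
                return rows;
            }
        }
    }

    public static void main(String[] args) throws SQLException {
        // args[0]: JDBC connection URL, args[1]: query text (both supplied by the caller).
        try (Connection conn = DriverManager.getConnection(args[0])) {
            System.out.println(countCappedRows(conn, args[1], 100));
        }
    }
}
```

Setting the cap on the statement rather than editing the SQL keeps the query text unchanged, which is the main appeal of the JDBC route.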

  1. … 34 more files in changeset.
DRILL-7062: Initial implementation of run-time rowgroup pruning

closes #1738

  1. … 24 more files in changeset.
DRILL-7096: Develop vector for canonical Map<K,V>

- Added new type DICT;

- Created value vectors for the type for single and repeated modes;

- Implemented corresponding FieldReaders and FieldWriters;

- Made changes in EvaluationVisitor to be able to read values from the map by key;

- Made changes to DrillParquetGroupConverter to be able to read Parquet's MAP type;

- Added an option `store.parquet.reader.enable_map_support` to disable reading MAP type as DICT from Parquet files;

- Updated AvroRecordReader to use new DICT type for Avro's MAP;

- Added support of the new type to ParquetRecordWriter.

  1. … 108 more files in changeset.
DRILL-7110: Skip writing profile when an ALTER SESSION is executed (#1703)

Allows (by default) for `ALTER SESSION SET <option>=<value>` queries to NOT be written to the profile store. This avoids writing a large number of unnecessary profiles, since those changes are also reflected in the queries that follow.

  1. … 5 more files in changeset.
DRILL-7148: Use improved join cardinality and ndv estimation with statistics

closes #1744

  1. … 11 more files in changeset.
DRILL-7095: Expose table schema (TupleMetadata) to physical operator (EasySubScan)

1. Add system / session option store.table.use_schema_file to control if file schema can be used during query execution. False by default.

2. Added methods to the StoragePlugin interface that allow creating a Group Scan with a provided table schema.

3. EasyGroupScan and EasySubScan now contain the table schema and can serialize / deserialize it along with other scan properties.

4. DrillTable, which is the main entry point for schema provisioning, has a method to store the schema and later uses it to create the physical scan.

5. WorkspaceSchema, when returning a Drill table instance, gets the table schema from the table root if it is available and store.table.use_schema_file is set to true.

This PR is the next step for the Schema Provisioning project, which currently exposes the schema only for the text reader.

closes #1696

  1. … 15 more files in changeset.
DRILL-7060: Support JsonParser Feature 'ALLOW_BACKSLASH_ESCAPING_ANY_CHARACTER' (#1663)

  1. … 7 more files in changeset.
DRILL-6952: Host compliant text reader on the row set framework

The result set loader allows controlling batch sizes. The new scan framework built on top of that framework handles projection, implicit columns, null columns and more. This commit converts the "new" ("compliant") text reader to use the new framework. Options select the use of the V2 ("new") or V3 (row-set based) versions. Unit tests demonstrate V3 functionality.

closes #1683

  1. … 58 more files in changeset.
DRILL-7117: Support creation of equi-depth histogram for selected data types.

Support int/bigint/float4/float8, time/timestamp/date and boolean.

Build the histogram from the t-digest byte array and serialize as JSON string.

More changes for serialization/deserialization.

Add code-gen stubs (empty) for VarChar/VarBinary types.

Address review comments (part 1). Add unit test.

Address review comments (part 2) for sampling.

close apache/drill#1715
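
To make the term concrete: an equi-depth histogram chooses bucket boundaries so that each bucket holds roughly the same number of values. The sketch below computes boundaries from a fully sorted array, which is a deliberately simpler technique than the t-digest-based construction the commit describes; the class and method names are hypothetical.

```java
import java.util.Arrays;

// Illustrative only: exact quantiles from sorted data, not the t-digest approach used in the commit.
final class EquiDepthSketch {

    /**
     * Returns numBuckets + 1 boundaries such that each bucket covers
     * roughly the same number of input values.
     */
    static double[] boundaries(double[] values, int numBuckets) {
        if (values.length == 0 || numBuckets < 1) {
            throw new IllegalArgumentException("need data and at least one bucket");
        }
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        double[] bounds = new double[numBuckets + 1];
        for (int i = 0; i <= numBuckets; i++) {
            // Position of the i/numBuckets quantile in the sorted data.
            int idx = (int) Math.round((double) i * (sorted.length - 1) / numBuckets);
            bounds[i] = sorted[idx];
        }
        return bounds;
    }

    public static void main(String[] args) {
        double[] skewed = {1, 1, 1, 1, 2, 3, 10, 100, 1000};
        System.out.println(Arrays.toString(boundaries(skewed, 2)));
    }
}
```

On skewed data the buckets end up with very different widths but similar row counts, which is what makes this kind of histogram useful for selectivity estimates.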

  1. … 15 more files in changeset.
DRILL-7046: Support for loading and parsing new RM config file

closes #1652

  1. … 63 more files in changeset.
DRILL-6969: Fix inconsistency of reading MaprDB JSON tables using hive plugin when native reader is enabled

closes #1610

  1. … 9 more files in changeset.
DRILL-6050: Provide a limit to number of rows fetched for a query in UI

Currently, the WebServer side needs to process the entire set of results and stream it back to the WebClient.

Since the WebUI does paginate results, we can load a larger set for pagination on the browser client and relieve the WebServer of the pressure of hosting all the data (most of which will never be streamed to the browser).

For example, fetching all rows from a table with 1 billion records is impractical, and the result can be capped at (say) 1K rows. Currently, the user has to explicitly specify LIMIT in the submitted query.

An input field is provided for this limit, and the option can be set to be selected by default for the Web UI.

The submitted query indicates that an auto-limiting wrapper was applied.

[Update #1] Updated as per comments

1. Limit Wrapping Unchecked by default

2. Full List configuration of results

[Update #2] Minor update

[Update #3] Followup

closes #1593
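
One way to picture the auto-limiting wrapper mentioned above is as an outer query that caps the rows of the user's query. The sketch below is an assumption about the general shape, not the SQL that Drill's Web UI actually generates; the class and method names are hypothetical.

```java
// Hypothetical sketch of a limit wrapper; the exact SQL Drill submits is not shown in the commit message.
final class AutoLimitSketch {

    /** Wraps the query in an outer SELECT with a LIMIT; a non-positive cap leaves it unchanged. */
    static String wrap(String query, int maxRows) {
        if (maxRows <= 0) {
            return query; // no cap requested
        }
        String trimmed = query.trim();
        if (trimmed.endsWith(";")) {
            // A trailing semicolon would be invalid inside a subquery.
            trimmed = trimmed.substring(0, trimmed.length() - 1).trim();
        }
        return "SELECT * FROM (\n" + trimmed + "\n) LIMIT " + maxRows;
    }

    public static void main(String[] args) {
        System.out.println(wrap("SELECT name FROM people ORDER BY name;", 1000));
    }
}
```

Wrapping rather than appending `LIMIT` avoids clashing with a limit or ordering clause the user already wrote.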

  1. … 5 more files in changeset.