DRILL-7442: Create multi-batch row set reader

Adds a ResultSetReader that works across multiple batches in a result set. Reuses the same row set and readers if the schema is unchanged, creates a new set if the schema changes.

Adds a unit test for the result set reader.

Adds a "rebind" capability to the row set readers to focus on new buffers under an existing set of vectors. Used when a new batch arrives, if the schema is unchanged.

Extends row set classes to be aware of the BatchAccessor class, which encapsulates a container and optional selection vector, and tracks schema changes.

Moves row set tests into the same package as the row sets. (Row set classes were moved a while back, but the tests were not moved.)

Renames some BatchAccessor methods.

closes #1897

  1. … 62 more files in changeset.
DRILL-7314: Use TupleMetadata instead of concrete implementation

1. Add ser / de implementation for TupleMetadata interface based on types.

2. Replace TupleSchema usage where possible.

3. Move patcher classes into commons.

4. Upgrade some dependencies and general refactoring.

  1. … 39 more files in changeset.
DRILL-7188: Revert DRILL-6642: Update protocol-buffers version

1. Updated protobuf to version 3.6.1

2. Added protobuf to the root pom dependency management

3. Added classes BoundedByteString and LiteralByteString for compatibility with HBase

4. Added ProtobufPatcher to provide compatibility with MapR-DB and HBase

  1. … 40 more files in changeset.
DRILL-7049: REST API returns the toString of byte arrays (VARBINARY types)

closes #1739

  1. … 1 more file in changeset.
DRILL-7148: Use improved join cardinality and ndv estimation with statistics

closes #1744

  1. … 11 more files in changeset.
DRILL-7155: Create a standard logging message for batch sizes generated by individual operators

This is needed for QA verification of the Batch Size feature DRILL-6238.

closes #1716

  1. … 10 more files in changeset.
DRILL-7011: Support schema in scan framework

* Adds schema support to the row set-based scan framework and to the "V3" text reader based on that framework.

* Adding the schema made clear that passing options as a long list of constructor arguments was not sustainable. Refactored code to use a builder pattern instead.

* Added support for default values in the "null column loader", which required adding a "setValue" method to the column accessors.

* Added unit tests for all new or changed functionality. See TestCsvWithSchema for the overall test of the entire integrated mechanism.

* Added tests for explicit projection with schema

* Better handling of date/time in column accessors

* Converted recent column metadata work from Java 8 date/time to Joda.

* Added more CSV-with-schema unit tests

* Removed the ID fields from "resolved columns", used "instanceof" instead.

* Added wildcard projection with an output schema. Handles both "lenient" and "strict" schemas.

* Tagged projection columns with their output schema, when available.

* Scan projection added modes for wildcard with an output schema. The reader projection added support for merging reader and output schemas.

* Includes refactoring of scan operator tests (the test file grew too large.)

* Renamed some classes to avoid confusing reader schemas with output schemas.

* Added unit tests for the new functionality.

* Added "lenient" wildcard with schema test for CSV

* Added more type conversions: string-to-bit, many-to-string

* Fixed bug in column writer for VarDecimal

* Added missing unit tests, and fixed bugs, in Bit column reader/writer

* Cleaned up a number of unneeded "SuppressWarnings"

closes #1711

  1. … 224 more files in changeset.
DRILL-6852: Adapt current Parquet Metadata cache implementation to use Drill Metastore API

Co-authored-by: Volodymyr Vysotskyi <vvovyk@gmail.com>

Co-authored-by: Vitalii Diravka <vitalii@apache.org>

close apache/drill#1646

  1. … 66 more files in changeset.
DRILL-7068: Support memory adjustment framework for resource management with Queues.

closes #1677

    • -20
    • +26
    ./MemoryAllocationUtilities.java
  1. … 37 more files in changeset.
DRILL-7200: Update Calcite to 1.19.0 / 1.20.0

  1. … 46 more files in changeset.
DRILL-7049 return VARBINARY as a string with escaped non printable bytes
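
The commit message does not say which escape syntax is used, so the following is only a minimal sketch of the idea: printable ASCII bytes are kept as characters and every other byte is written as an escape. The class name `VarBinaryEscape`, the method `escape`, and the `\xNN` format are all assumptions made for illustration, not Drill's actual code.

```java
public class VarBinaryEscape {

    // Hypothetical helper, not Drill's implementation.
    // Printable ASCII (0x20..0x7E) is copied as-is; every other byte,
    // and the backslash itself, is written as \xNN (an assumed format).
    public static String escape(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length);
        for (byte b : bytes) {
            int v = b & 0xFF; // treat the byte as unsigned
            if (v >= 0x20 && v <= 0x7E && v != '\\') {
                sb.append((char) v);
            } else {
                sb.append(String.format("\\x%02X", v));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] sample = {'a', 'b', 0, (byte) 0xFF};
        System.out.println(escape(sample));
    }
}
```

Escaping the backslash as well keeps the output unambiguous, so a reader can tell a literal backslash from the start of an escape.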

DRILL-5603: Replace String file paths to Hadoop Path

- replaced all String path representation with org.apache.hadoop.fs.Path

- added PathSerDe.Se JSON serializer

- refactoring of DFSPartitionLocation code by leveraging existing listPartitionValues() functionality

closes #1657

  1. … 83 more files in changeset.
DRILL-6931: File listing: fix issue for S3 directory objects and improve performance for recursive listing

closes #1590

  1. … 2 more files in changeset.
DRILL-6858: Add functionality to list directories / files with exceptions suppression

1. Add listDirectoriesSafe, listFilesSafe, listAllSafe in FileSystemUtil and DrillFileSystemUtil classes.

2. Use FileSystemUtil.listAllSafe during listing files in show files command and information_schema.files table.

closes #1547
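
The "safe" listing idea in this commit can be illustrated with a small sketch. It uses java.nio.file rather than the Hadoop FileSystem API so that it runs on its own. The class `SafeListingSketch` and its method are hypothetical and only borrow the name listFilesSafe from the commit message; the real methods live in FileSystemUtil and DrillFileSystemUtil and may behave differently.

```java
import java.io.IOException;
import java.nio.file.DirectoryIteratorException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class SafeListingSketch {

    // Hypothetical stand-in for a "safe" listing call: any I/O failure
    // (missing directory, permission problem, and so on) is suppressed
    // and whatever was collected so far is returned.
    public static List<Path> listFilesSafe(Path dir) {
        List<Path> files = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path entry : stream) {
                if (Files.isRegularFile(entry)) {
                    files.add(entry);
                }
            }
        } catch (IOException | DirectoryIteratorException e) {
            // Suppressed on purpose; a real implementation would likely log this.
        }
        return files;
    }

    public static void main(String[] args) {
        // A missing directory yields an empty list instead of an exception.
        System.out.println(listFilesSafe(Paths.get("no-such-directory")));
    }
}
```

A command such as show files can then display whatever is readable instead of failing on the first inaccessible entry.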

  1. … 6 more files in changeset.
DRILL-6850: Force setting DRILL_LOGICAL Convention for DrillRelFactories and DrillFilterRel

- Fix workspace case insensitivity for JDBC storage plugin

  1. … 13 more files in changeset.
DRILL-6642: Update protocol-buffers version

1. Updated protobuf to version 3.6.1

2. Added protobuf to the root pom dependency management

3. Added classes BoundedByteString and LiteralByteString for compatibility with HBase

4. Added ProtobufPatcher to provide compatibility with MapR-DB and HBase

closes #1639

    • -0
    • +186
    ./ProtobufPatcher.java
  1. … 40 more files in changeset.
DRILL-6410: Fixed memory leak in flat Parquet reader

    • -0
    • +194
    ./concurrent/ExecutorServiceUtil.java
  1. … 3 more files in changeset.
DRILL-6762: Fix dynamic UDFs versioning issue

1. Added UndefinedVersionDelegatingStore to serve as versioned wrapper for those stores that do not support versioning.

2. Aligned remote and local function registries version type. Type will be represented as int since ZK version is returned as int.

3. Added NOT_AVAILABLE and UNDEFINED versions to DataChangeVersion holder to indicate proper registry state.

4. Added additional trace logging.

5. Minor refactoring and clean up.

closes #1484

  1. … 21 more files in changeset.
DRILL-6422: Replace guava imports with shaded ones

    • -2
    • +2
    ./filereader/BufferedDirectBufInputStream.java
    • -2
    • +2
    ./filereader/DirectBufInputStream.java
  1. … 976 more files in changeset.
DRILL-6709: Extended the batch stats utility to other operators

closes #1444

    • -13
    • +105
    ./record/RecordBatchStats.java
  1. … 15 more files in changeset.
DRILL-6656: Disallow extra semicolons and multiple statements on the same line.

closes #1415

  1. … 144 more files in changeset.
DRILL-6544: Allow timestamp / date / time formatting when displaying on Web UI

Added the following options that set the format patterns: web.timestamp.display_format, web.date.display_format, web.time.display_format.

See https://docs.oracle.com/javase/8/docs/api/java/time/format/DateTimeFormatter.html for the details about acceptable values.

Default formatting is used if the corresponding option is an empty string.
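
As a rough illustration of the fallback rule described here, the sketch below formats a date with a java.time DateTimeFormatter pattern and falls back to the default rendering when the pattern is empty. The class `DisplayFormatSketch` and the method `formatDate` are hypothetical names, and ISO output from `toString()` is assumed as the default; the actual logic lives in ValueVectorElementFormatter and may differ.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

public class DisplayFormatSketch {

    // Hypothetical helper: "pattern" stands for the value of an option
    // such as web.date.display_format.
    public static String formatDate(LocalDate value, String pattern) {
        if (pattern == null || pattern.isEmpty()) {
            // Empty option: fall back to the default (assumed ISO) rendering.
            return value.toString();
        }
        return DateTimeFormatter.ofPattern(pattern).format(value);
    }

    public static void main(String[] args) {
        LocalDate date = LocalDate.of(2018, 7, 4);
        System.out.println(formatDate(date, ""));           // default formatting
        System.out.println(formatDate(date, "dd/MM/yyyy")); // custom pattern
    }
}
```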

    • -0
    • +121
    ./ValueVectorElementFormatter.java
  1. … 6 more files in changeset.
DRILL-6494: Drill Plugins Handler

- Storage Plugins Handler service is used at the Drill start-up stage and it updates storage plugins configs from the storage-plugins-override.conf file. If plugins configs are present in the persistence store, they are updated; otherwise bootstrap plugins are updated and the resulting configs are loaded to the persistence store. If the enabled status is absent in the storage-plugins-override.conf file, the last plugin config enabled status persists.

- 'drill.exec.storage.action_on_plugins_override_file' boot option is added. This is the action that should be performed on the storage-plugins-override.conf file after the storage plugins configs are successfully updated. Possible values are: "none" (default), "rename" and "remove".

- The "NULL" issue with updating Hive plugin config by REST is solved. But clients are still being instantiated for disabled plugins (DRILL-6412).

- "org.honton.chas.hocon:jackson-dataformat-hocon" library is added for proper deserialization of the HOCON conf file

- additional refactoring: "com.typesafe:config" and "org.apache.commons:commons-lang3" are placed into the DependencyManagement block with proper versions; correct properties for metrics in "drill-override-example.conf" are specified

closes #1345

  1. … 34 more files in changeset.
DRILL-6560: Enhanced the batch statistics logging enablement

closes #1355

    • -62
    • +119
    ./record/RecordBatchStats.java
  1. … 9 more files in changeset.
DRILL-6496: Added print methods for debugging tests, and fixed missing log statement in VectorUtils.

closes #1336

    • -0
    • +29
    ./CheckedSupplier.java
  1. … 32 more files in changeset.
DRILL-6526: Refactor FileSystemConfig to disallow direct access from the code to its variables

  1. … 10 more files in changeset.
DRILL-6147: Adding Columnar Parquet Batch Sizing functionality

closes #1330

    • -0
    • +225
    ./record/RecordBatchStats.java
  1. … 43 more files in changeset.
DRILL-6438: Remove excess logging from the tests.

- Removed usages of System.out and System.err from the test and replaced with loggers

closes #1284

  1. … 90 more files in changeset.
DRILL-6353: Upgrade Parquet MR dependencies

closes #1259

    • -4
    • +4
    ./filereader/BufferedDirectBufInputStream.java
    • -3
    • +8
    ./filereader/DirectBufInputStream.java
  1. … 18 more files in changeset.
DRILL-6386: Remove unused imports and star imports.

  1. … 231 more files in changeset.