DRILL-2826: Simplify and centralize Operator Cleanup

- Remove cleanup method from RecordBatch interface

- Make OperatorContext creation and closing the responsibility of FragmentContext

- Make OperatorContext an abstract class and the impl only available to FragmentContext

- Make RecordBatch closing the responsibility of the RootExec

- Make all closes be suppressing closes to maximize memory release on failure

- Add new CloseableRecordBatch interface used by RootExec

- Make RootExec AutoCloseable

- Update RecordBatchCreator to return CloseableRecordBatches so that RootExec can maintain list

- Generate list of operators through change in ImplCreator

    • -1
    • +2
    ./exec/store/hbase/HBaseScanBatchCreator.java
  1. … 95 more files in changeset.
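The "suppressing closes" item in the DRILL-2826 message can be sketched as follows. This is a minimal illustration, assuming a plain list of `AutoCloseable` resources; `SuppressingCloser` and `closeAll` are hypothetical names, not Drill classes. Every resource gets a close attempt even if an earlier one throws, and later failures are attached to the first through `Throwable.addSuppressed`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Drill: closes every resource even when some closes fail.
final class SuppressingCloser {
  private SuppressingCloser() {}

  // Closes all resources in order. The first exception is kept as the primary
  // failure; later exceptions are attached to it as suppressed exceptions.
  static void closeAll(List<? extends AutoCloseable> resources) throws Exception {
    Exception first = null;
    for (AutoCloseable resource : resources) {
      try {
        if (resource != null) {
          resource.close();
        }
      } catch (Exception e) {
        if (first == null) {
          first = e;
        } else {
          first.addSuppressed(e);
        }
      }
    }
    if (first != null) {
      throw first;
    }
  }

  public static void main(String[] args) {
    List<String> closed = new ArrayList<>();
    List<AutoCloseable> resources = new ArrayList<>();
    resources.add(() -> { closed.add("a"); throw new IllegalStateException("a failed"); });
    resources.add(() -> closed.add("b"));
    resources.add(() -> { closed.add("c"); throw new IllegalStateException("c failed"); });
    try {
      closeAll(resources);
    } catch (Exception e) {
      // All three resources were attempted; the second failure rides along as suppressed.
      System.out.println(closed + " primary=" + e.getMessage()
          + " suppressed=" + e.getSuppressed().length);
    }
  }
}
```

Attempting every close before rethrowing means a failure in one resource does not prevent the others from releasing what they hold.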
DRILL-2567: CONVERT_FROM in where clause cause the query to fail in planning phase

Set the writeIndex of ByteBuf returned by Unpooled.wrappedBuffer() to 0.

+ Added a unit test to exercise the code path.

  1. … 1 more file in changeset.
DRILL-2514: Add support for impersonation in FileSystem storage plugin.

    • -5
    • +12
    ./exec/store/hbase/HBaseGroupScan.java
    • -1
    • +2
    ./exec/store/hbase/HBasePushFilterIntoScan.java
    • -2
    • +2
    ./exec/store/hbase/HBaseSchemaFactory.java
    • -5
    • +5
    ./exec/store/hbase/HBaseStoragePlugin.java
  1. … 66 more files in changeset.
DRILL-2413: FileSystemPlugin refactoring: avoid sharing DrillFileSystem across schemas

    • -1
    • +1
    ./exec/store/hbase/HBaseSchemaFactory.java
    • -1
    • +1
    ./exec/store/hbase/HBaseStoragePlugin.java
  1. … 32 more files in changeset.
DRILL-1690: Issue with using HBase plugin to access row_key only

    • -18
    • +21
    ./exec/store/hbase/HBaseRecordReader.java
DRILL-2173: Partition queries to drive dynamic pruning

Adds new interface on the QueryContext as well as individual schemas for exploring partitions of tables.

Adds injectable type for partition explorer for use in UDFs. This is hooked into both expression materialization and interpreted evaluation. The FragmentContext throws an exception to tell users to turn on constant folding if a UDF that uses the PartitionExplorer makes it past planning.

2173 update - Address Chris' review comments.

Change the PartitionExplorer to return an Iterable<String> instead of String[]

Add interface level description to PartitionExplorer and StoragePluginPartitionExplorer.

New inner class in FileSystemPlugin to fulfill the new Iterable interface for partitions.

Formatting/cleanup fixes

Clean up error reporting code in MaxDir UDF. Remove method to get a string from a DrillBuf, as it was already defined in StringFunctionHelpers. Add new utility method to specifically convert a VarCharHolder to a string to remove some boilerplate.

Fixed an errant copy paste in a comment and removed unused imports.

Fix docs in FileSystemPlugin, belongs with the 2173 changes.

Fix references in Javadoc to properly use @link instead of @see.

2173 fixes, correctly return an empty list of sub-partitions if the path requested in the partition explorer interface is a file. Fix a few docs.

More 2173, finishing Chris' comments

2173 update - Add validation for PartitionExplorer injectable in UdfUtilities.

small change to fix refactored unit tests.

cleanup for 2173

Fix maxdir UDF so it can compile in runtime generated code as well as the interpreted expression system (needed to fully qualify classes and interfaces). It still fails to execute, as we prevent requesting a schema from a non-root fragment. We do not expect these types of functions to ever be used without constant folding so this should not be an issue.

Update error message in the case where the partition explorer is being used outside of planning.

Adding FreeMarker-generated maxdir, imaxdir, mindir and imindir

remove import that violates build checks, fix typo in new test class name

Separate out SubDirectoryList from FileSystemSchemaFactory.

Fix unit test to correctly test all four functions.

Update partition explorer to take List instead of Collection. As the lists are used in parallel it should be explicit that these are expected to be ordered (which Collections do not guarantee).

Drop the extra file generated due to the header in the FreeMarker template, fix a typo, and remove an unused import.

    • -1
    • +1
    ./exec/store/hbase/HBaseSchemaFactory.java
  1. … 19 more files in changeset.
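The DRILL-2173 note about taking `List` instead of `Collection` can be illustrated with a small sketch. It assumes partition column names and values are passed as two parallel lists and matched by index; `ParallelPartitionLists` and `pair` are hypothetical names and this is not the PartitionExplorer interface itself. The `dir0`/`dir1` and value strings are sample data only.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration, not Drill's PartitionExplorer: shows why ordered
// Lists are needed when column names and values are passed in parallel.
final class ParallelPartitionLists {
  private ParallelPartitionLists() {}

  // Pairs columns.get(i) with values.get(i). This is only meaningful when both
  // arguments have a defined order, which List guarantees and Collection does not.
  static Map<String, String> pair(List<String> columns, List<String> values) {
    if (columns.size() != values.size()) {
      throw new IllegalArgumentException("columns and values must have the same length");
    }
    Map<String, String> paired = new LinkedHashMap<>();
    for (int i = 0; i < columns.size(); i++) {
      paired.put(columns.get(i), values.get(i));
    }
    return paired;
  }

  public static void main(String[] args) {
    Map<String, String> paired =
        pair(Arrays.asList("dir0", "dir1"), Arrays.asList("2015", "Q1"));
    System.out.println(paired);
  }
}
```

With an unordered `Collection` there would be no index to line up a column with its value, which is the concern the commit message raises.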
DRILL-2090: Update HBase storage plugin to support HBase 0.98

    • -2
    • +2
    ./exec/store/hbase/HBaseFilterBuilder.java
    • -31
    • +25
    ./exec/store/hbase/HBaseRecordReader.java
    • -4
    • +11
    ./exec/store/hbase/TableStatsCalculator.java
  1. … 6 more files in changeset.
DRILL-1960: Automatic reallocation

    • -8
    • +2
    ./exec/store/hbase/HBaseRecordReader.java
  1. … 64 more files in changeset.
DRILL-1947: Cache PStore/EStore instances rather than recreating on each need. As part of this, make sure that PStoreConfig doesn't use identity equality.

    • -0
    • +4
    ./exec/store/hbase/config/HBasePStore.java
  1. … 17 more files in changeset.
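Why a cached store needs a config without identity equality can be shown with a minimal sketch. It assumes the cache is a map keyed by the config object; `StoreConfigKey` and `StoreCache` are hypothetical stand-ins, not Drill's PStoreConfig or provider classes, and the field names are invented for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-in for a store configuration; not Drill's PStoreConfig.
final class StoreConfigKey {
  private final String name;
  private final String mode;

  StoreConfigKey(String name, String mode) {
    this.name = Objects.requireNonNull(name);
    this.mode = Objects.requireNonNull(mode);
  }

  // Value equality: two configs with the same fields act as the same cache key.
  @Override
  public boolean equals(Object other) {
    if (this == other) {
      return true;
    }
    if (!(other instanceof StoreConfigKey)) {
      return false;
    }
    StoreConfigKey that = (StoreConfigKey) other;
    return name.equals(that.name) && mode.equals(that.mode);
  }

  @Override
  public int hashCode() {
    return Objects.hash(name, mode);
  }
}

// Hypothetical cache that creates one store object per distinct config.
final class StoreCache {
  private final Map<StoreConfigKey, Object> stores = new HashMap<>();
  private int created = 0;

  Object getOrCreate(StoreConfigKey config) {
    return stores.computeIfAbsent(config, key -> {
      created++;
      return new Object(); // placeholder for an expensive store instance
    });
  }

  int createdCount() {
    return created;
  }
}
```

If `equals` and `hashCode` fell back to the `Object` defaults, every freshly constructed config would miss the cache and a new store would be created on each request.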
DRILL-1917: Limit the number of results from HBasePStore.iterator() to MaxIteratorSize in BLOB_PERSISTENT mode

*** Wishing everyone a Happy New Year ***

    • -3
    • +12
    ./exec/store/hbase/config/HBasePStore.java
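Capping the number of results an iterator returns, as the DRILL-1917 title describes, might look like the sketch below. `LimitedIterator` is a hypothetical wrapper written for illustration, and its `maxSize` parameter merely plays the role of MaxIteratorSize; this is not the HBasePStore code.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical wrapper, not HBasePStore's code: stops after maxSize elements.
final class LimitedIterator<T> implements Iterator<T> {
  private final Iterator<T> delegate;
  private final int maxSize;
  private int returned = 0;

  LimitedIterator(Iterator<T> delegate, int maxSize) {
    if (maxSize < 0) {
      throw new IllegalArgumentException("maxSize must be non-negative");
    }
    this.delegate = delegate;
    this.maxSize = maxSize;
  }

  // Reports no more elements once the cap is reached, even if the delegate has more.
  @Override
  public boolean hasNext() {
    return returned < maxSize && delegate.hasNext();
  }

  @Override
  public T next() {
    if (!hasNext()) {
      throw new NoSuchElementException();
    }
    returned++;
    return delegate.next();
  }

  public static void main(String[] args) {
    Iterator<Integer> it = new LimitedIterator<>(Arrays.asList(1, 2, 3, 4, 5).iterator(), 3);
    while (it.hasNext()) {
      System.out.println(it.next());
    }
  }
}
```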
DRILL-1900: Fix numeric overflow problem in hbase stat calculation.

    • -1
    • +1
    ./exec/store/hbase/HBaseGroupScan.java
    • -3
    • +6
    ./exec/store/hbase/TableStatsCalculator.java
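The commit title does not say which expression overflowed, so the following is only a generic sketch of the kind of problem involved: multiplying two `int` values wraps around before the result is widened to `long`. `StatOverflowSketch` and its method and parameter names are hypothetical.

```java
// Hypothetical illustration of integer overflow in a size estimate; not Drill's code.
final class StatOverflowSketch {
  private StatOverflowSketch() {}

  // Problematic form: the multiplication happens in 32-bit int arithmetic and
  // wraps around; only the already-wrapped result is widened to long.
  static long estimateBytesWrapped(int rowCount, int avgRowBytes) {
    return rowCount * avgRowBytes;
  }

  // Safer form: widen one operand first so the multiplication is done in 64 bits.
  static long estimateBytes(int rowCount, int avgRowBytes) {
    return (long) rowCount * avgRowBytes;
  }

  public static void main(String[] args) {
    System.out.println(estimateBytesWrapped(3_000_000, 1_000));
    System.out.println(estimateBytes(3_000_000, 1_000));
  }
}
```

Where silent wrap-around is unacceptable even in 64 bits, `Math.multiplyExact` throws `ArithmeticException` on overflow instead.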
DRILL-1684, DRILL-1517, DRILL-1350: Profile and cancellation updates

- Remove any storage of persisted profiles.

- Store a separate query info object for active queries.

- Update cancellation and running profile loading to query foreman server.

- Make file store support HDFS APIs.

- Update PStoreProvider to use configuration to decide if you want PERSISTENT, EPHEMERAL, or BLOB storage rather than separate interfaces.

- Update ZkPStore's persistent mode to leverage a cache and respond to changes rather than actively probing values.

- Update ZkPStore's cache to be effectively write-through.

- Automatically delete deprecated or default value options from PStore.

    • -12
    • +5
    ./exec/store/hbase/config/HBasePStore.java
  1. … 41 more files in changeset.
DRILL-1339: Use EStore to track running query status.

Move common code to ZkAbstractStore.

Get full profile from foreman directly.

code cleanup.

code change based on review comments.

ZK store checks that the node exists before delete. Add error message in case of error.

Use a different profile for running queries, so that running query would have different ZK node from completed queries.

More logging. Do not delete query state in EStore; instead, modify the state in EStore.

  1. … 18 more files in changeset.
DRILL-1508: Implement pushdown for LIKE operator in HBase storage engine

    • -5
    • +63
    ./exec/store/hbase/HBaseFilterBuilder.java
    • -0
    • +165
    ./exec/store/hbase/HBaseRegexParser.java
  1. … 6 more files in changeset.
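Pushing a LIKE predicate down to a regex-based filter requires translating the SQL pattern into a regular expression. The sketch below shows one way to do that, assuming `%` maps to `.*`, `_` maps to `.`, and a caller-supplied escape character; `LikeToRegexSketch` is a hypothetical class and does not reproduce HBaseRegexParser.

```java
import java.util.regex.Pattern;

// Hypothetical translation of a SQL LIKE pattern to a Java regex; not HBaseRegexParser.
final class LikeToRegexSketch {
  private LikeToRegexSketch() {}

  // '%' becomes ".*", '_' becomes ".", and every other character is quoted so
  // that regex metacharacters in the pattern are matched literally.
  static String toRegex(String like, char escape) {
    StringBuilder regex = new StringBuilder();
    boolean escaped = false;
    for (int i = 0; i < like.length(); i++) {
      char c = like.charAt(i);
      if (escaped) {
        regex.append(Pattern.quote(String.valueOf(c)));
        escaped = false;
      } else if (c == escape) {
        escaped = true;
      } else if (c == '%') {
        regex.append(".*");
      } else if (c == '_') {
        regex.append('.');
      } else {
        regex.append(Pattern.quote(String.valueOf(c)));
      }
    }
    if (escaped) {
      throw new IllegalArgumentException("pattern ends with the escape character");
    }
    return regex.toString();
  }

  // Convenience check using backslash as the escape character.
  static boolean matches(String value, String like) {
    return Pattern.compile(toRegex(like, '\\'), Pattern.DOTALL).matcher(value).matches();
  }

  public static void main(String[] args) {
    System.out.println(matches("row_key_1", "row%"));
    System.out.println(matches("a.c", "a_c"));
  }
}
```

Quoting each literal character is a deliberately simple choice; it keeps characters such as `.` or `*` in the LIKE pattern from being interpreted as regex operators.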
DRILL-1567 - Pushdown fails with WHERE clause with more than one AND or OR operator of same type

    • -8
    • +11
    ./exec/store/hbase/HBaseFilterBuilder.java
DRILL-1384: Part 1 - Rebase on Calcite. Change code due to Calcite package renaming/re-structure.

Optiq changed to use DATETIME_PLUS. Have to handle it in Drill.

PushFilterPastJoinRule has some issue. Temp fix for that.

Failed unit tests:

1) TestFlatten

2) TestConvertFunctions / TestComplexTypeWriter : "Concat"

3) TPCH Q16 : CanNotPlanException

Feed a RelDataTypeSystem into planner, to support decimal with precision/scale up to 38.

Remove assertion in DrillFilterRel. Optiq/Calcite could create a TRUE AND TRUE for query like WHERE col1 in (select ...) and col2 in (select ...) .

Rebase on calcite-1.1.0-drill-test-r1. Change code due to Calcite package renaming/re-structure.

Rebase on Calcite: renaming with Perl script. Part 1

reverse change to jdbc test.

Renaming for rebasing calcite. Part 2

Renaming for calcite rebasing. Part 3

Renaming for calcite rebasing. Part 4

Reverse change to testcase in jdbc.

Renaming for calcite rebasing. Part 5

Renaming for calcite rebasing. Part 6

remove 1.sh

WindowRel change related.

Renaming for calcite rebase. Part 7

PreprocessLogical and AggPrelBase

Renaming for calcite rebasing. Part 8. More manual change

Rebasing Calcite. Part 9

Rebasing calcite. Part 10

Rebasing API change from Calcite.

SQL parser change, due to Calcite rebasing.

Renaming change for calcite rebasing.

Renaming package due to Calcite rebasing.

Renaming package due to Calcite rebase.

Work in progress for calcite rebasing.

Change import package names due to Calcite rebase.

Code refactor due to Calcite rebasing.

Fix bug in DistributionTraitDef.

Resolve compiler error, due to Calcite Rebasing.

Resolve compiler error after Calcite Rebasing.

minor change.

    • -3
    • +3
    ./exec/store/hbase/DrillHBaseTable.java
    • -3
    • +3
    ./exec/store/hbase/HBasePushFilterIntoScan.java
    • -3
    • +3
    ./exec/store/hbase/HBaseSchemaFactory.java
    • -1
    • +1
    ./exec/store/hbase/HBaseStoragePlugin.java
  1. … 258 more files in changeset.
DRILL-1433: Fixing Where query on HBase store when row_key doesn't exist in HBase

    • -4
    • +6
    ./exec/store/hbase/TableStatsCalculator.java
DRILL-1371: HBase queries fail when hbase.scan.sizecalculator.enabled is set to false

    • -1
    • +1
    ./exec/store/hbase/HBaseGroupScan.java
DRILL-1414: Move profile storage to DFS rather than using PStore

    • -5
    • +23
    ./exec/store/hbase/config/HBasePStore.java
  1. … 12 more files in changeset.
DRILL-1426: Add IO wait stats to HBaseRecordReader

    • -1
    • +10
    ./exec/store/hbase/HBaseRecordReader.java
  1. … 1 more file in changeset.
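Collecting an I/O wait statistic generally means timing the blocking call and accumulating the elapsed time. The sketch below assumes that approach; `IoWaitTimer` is a hypothetical helper and is not how HBaseRecordReader or Drill's operator stats are implemented.

```java
import java.util.concurrent.Callable;

// Hypothetical helper, not Drill's stats API: accumulates time spent inside blocking calls.
final class IoWaitTimer {
  private long waitNanos = 0;

  // Runs the given call and adds its elapsed wall-clock time to the running
  // total, whether the call returns normally or throws.
  <T> T time(Callable<T> io) throws Exception {
    long start = System.nanoTime();
    try {
      return io.call();
    } finally {
      waitNanos += System.nanoTime() - start;
    }
  }

  long waitNanos() {
    return waitNanos;
  }

  public static void main(String[] args) throws Exception {
    IoWaitTimer timer = new IoWaitTimer();
    // Thread.sleep stands in for a blocking read such as fetching the next result.
    String row = timer.time(() -> {
      Thread.sleep(5);
      return "row";
    });
    System.out.println(row + " waited " + timer.waitNanos() + " ns");
  }
}
```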
DRILL-1407: Add scan size calculator option to HBase storage plugin configuration

    • -1
    • +1
    ./exec/store/hbase/HBaseGroupScan.java
    • -1
    • +14
    ./exec/store/hbase/HBaseStoragePluginConfig.java
    • -8
    • +5
    ./exec/store/hbase/TableStatsCalculator.java
DRILL-1403: HBase predicate pushdown filters are not getting applied

    • -0
    • +6
    ./exec/store/hbase/HBaseFilterBuilder.java
DRILL-1402: Add check-style rules for trailing space, TABs and blocks without braces

    • -2
    • +4
    ./exec/store/hbase/HBaseGroupScan.java
  1. … 440 more files in changeset.
DRILL-634: Cleanup/organize Java imports and trailing whitespaces from Drill code

    • -1
    • +1
    ./exec/store/hbase/HBaseGroupScan.java
    • -2
    • +1
    ./exec/store/hbase/HBaseRecordReader.java
    • -0
    • +1
    ./exec/store/hbase/HBaseStoragePlugin.java
    • -2
    • +2
    ./exec/store/hbase/TableStatsCalculator.java
  1. … 765 more files in changeset.
DRILL-1346: Use HBase table size information to improve scan parallelization

    • -6
    • +14
    ./exec/store/hbase/HBaseGroupScan.java
    • -0
    • +179
    ./exec/store/hbase/TableStatsCalculator.java
  1. … 2 more files in changeset.
DRILL-1366: HBaseRecordReader does not set rowcount correctly if vectors run out of memory in the middle of the row.

    • -1
    • +1
    ./exec/store/hbase/HBaseRecordReader.java
DRILL-1309: Implement ProjectPastFilterPushdown and update DrillScanRel cost model so that a star query is more expensive than exclusive column projection. Various fixes affecting record readers to handle `*` column as well as fixes to some test cases.

exclude parquet files from rat check

    • -7
    • +8
    ./exec/store/hbase/HBaseGroupScan.java
    • -32
    • +36
    ./exec/store/hbase/HBaseRecordReader.java
    • -3
    • +7
    ./exec/store/hbase/HBaseScanBatchCreator.java
  1. … 30 more files in changeset.
DRILL-1328: Support table statistics - Part 2

Add support for avg row-width and major type statistics.

Parallelize the ANALYZE implementation and stats UDF implementation to improve stats collection performance.

Update/fix rowcount, selectivity and ndv computations to improve plan costing.

Add options for configuring collection/usage of statistics.

Add new APIs and implementation for stats writer (as a precursor to Drill Metastore APIs).

Fix several stats/costing related issues identified while running TPC-H and TPC-DS queries.

Add support for CPU sampling and nested scalar columns.

Add more testcases for collection and usage of statistics and fix remaining unit/functional test failures.

Thanks to Venki Korukanti (@vkorukanti) for the description below (modified to account for new changes). He graciously agreed to rebase the patch to latest master, fixed a few issues, and added a few tests.

FUNCS: Statistics functions as UDFs:

Separate

Currently using FieldReader to ensure consistent output type so that Unpivot doesn't get confused. All stats columns should be Nullable, so that stats functions can return NULL when N/A.

* custom versions of "count" that always return BigInt

* HyperLogLog based NDV that returns BigInt that works only on VarChars

* HyperLogLog with binary output that only works on VarChars

OPS: Updated protobufs for new ops

OPS: Implemented StatisticsMerge

OPS: Implemented StatisticsUnpivot

ANALYZE: AnalyzeTable functionality

* JavaCC syntax more-or-less copied from LucidDB.

* (Basic) AnalyzePrule: DrillAnalyzeRel -> UnpivotPrel StatsMergePrel FilterPrel(for sampling) StatsAggPrel ScanPrel

ANALYZE: Add getMetadataTable() to AbstractSchema

USAGE: Change field access in QueryWrapper

USAGE: Add getDrillTable() to DrillScanRelBase and ScanPrel

* since ScanPrel does not inherit from DrillScanRelBase, this requires adding a DrillTable to the constructor

* This is done so that a custom ReflectiveRelMetadataProvider can access the DrillTable associated with Logical/Physical scans.

USAGE: Attach DrillStatsTable to DrillTable.

* DrillStatsTable represents the data scanned from a corresponding ".stats.drill" table

* In order to avoid doing query execution right after the ".stats.drill" table is found, metadata is not actually collected until the MaterializationVisitor is used.

** Currently, the metadata source must be a string (so that a SQL query can be created). Doing this with a table is probably more complicated.

** Query is set up to extract only the most recent statistics results for each column.

closes #729

    • -0
    • +1
    ./exec/store/hbase/HBaseGroupScan.java
  1. … 143 more files in changeset.
DRILL-1329: External sort memory fixes

    • -0
    • +1
    ./exec/store/hbase/HBaseRecordReader.java
  1. … 38 more files in changeset.