DRILL-1407: Add scan size calculator option to HBase storage plugin configuration

    • -1
    • +1
    ./exec/store/hbase/HBaseGroupScan.java
    • -1
    • +14
    ./exec/store/hbase/HBaseStoragePluginConfig.java
    • -8
    • +5
    ./exec/store/hbase/TableStatsCalculator.java
DRILL-1403: HBase predicate pushdown filters are not getting applied

    • -0
    • +6
    ./exec/store/hbase/HBaseFilterBuilder.java
DRILL-1402: Add check-style rules for trailing space, TABs and blocks without braces

    • -2
    • +4
    ./exec/store/hbase/HBaseGroupScan.java
  1. … 440 more files in changeset.
DRILL-634: Clean up/organize Java imports and remove trailing whitespace from Drill code

    • -1
    • +1
    ./exec/store/hbase/HBaseGroupScan.java
    • -2
    • +1
    ./exec/store/hbase/HBaseRecordReader.java
    • -0
    • +1
    ./exec/store/hbase/HBaseStoragePlugin.java
    • -2
    • +2
    ./exec/store/hbase/TableStatsCalculator.java
  1. … 765 more files in changeset.
DRILL-1346: Use HBase table size information to improve scan parallelization

    • -6
    • +14
    ./exec/store/hbase/HBaseGroupScan.java
    • -0
    • +179
    ./exec/store/hbase/TableStatsCalculator.java
  1. … 2 more files in changeset.
DRILL-1366: HBaseRecordReader does not set rowcount correctly if vectors run out of memory in the middle of the row.

    • -1
    • +1
    ./exec/store/hbase/HBaseRecordReader.java
DRILL-1309: Implement ProjectPastFilterPushdown and update the DrillScanRel cost model so that a star query is more expensive than an exclusive column projection. Various fixes affecting record readers to handle the `*` column, as well as fixes to some test cases.

exclude parquet files from rat check

    • -7
    • +8
    ./exec/store/hbase/HBaseGroupScan.java
    • -32
    • +36
    ./exec/store/hbase/HBaseRecordReader.java
    • -3
    • +7
    ./exec/store/hbase/HBaseScanBatchCreator.java
  1. … 30 more files in changeset.
DRILL-1328: Support table statistics - Part 2

Add support for avg row-width and major type statistics.

Parallelize the ANALYZE implementation and stats UDF implementation to improve stats collection performance.

Update/fix rowcount, selectivity and ndv computations to improve plan costing.

Add options for configuring collection/usage of statistics.

Add new APIs and implementation for stats writer (as a precursor to Drill Metastore APIs).

Fix several stats/costing related issues identified while running TPC-H and TPC-DS queries.

Add support for CPU sampling and nested scalar columns.

Add more testcases for collection and usage of statistics and fix remaining unit/functional test failures.

Thanks to Venki Korukanti (@vkorukanti) for the description below (modified to account for new changes). He graciously agreed to rebase the patch onto the latest master, fixed a few issues, and added a few tests.

FUNCS: Statistics functions as UDFs:

Separate

Currently using FieldReader to ensure consistent output type so that Unpivot doesn't get confused. All stats columns should be Nullable, so that stats functions can return NULL when N/A.

* custom versions of "count" that always return BigInt

* HyperLogLog-based NDV that returns BigInt and works only on VarChars

* HyperLogLog with binary output that only works on VarChars

OPS: Updated protobufs for new ops

OPS: Implemented StatisticsMerge

OPS: Implemented StatisticsUnpivot

ANALYZE: AnalyzeTable functionality

* JavaCC syntax more-or-less copied from LucidDB.

* (Basic) AnalyzePrule: DrillAnalyzeRel -> UnpivotPrel StatsMergePrel FilterPrel(for sampling) StatsAggPrel ScanPrel

ANALYZE: Add getMetadataTable() to AbstractSchema

USAGE: Change field access in QueryWrapper

USAGE: Add getDrillTable() to DrillScanRelBase and ScanPrel

* since ScanPrel does not inherit from DrillScanRelBase, this requires adding a DrillTable to the constructor

* This is done so that a custom ReflectiveRelMetadataProvider can access the DrillTable associated with Logical/Physical scans.

USAGE: Attach DrillStatsTable to DrillTable.

* DrillStatsTable represents the data scanned from a corresponding ".stats.drill" table

* In order to avoid doing query execution right after the ".stats.drill" table is found, metadata is not actually collected until the MaterializationVisitor is used.

** Currently, the metadata source must be a string (so that a SQL query can be created). Doing this with a table is probably more complicated.

** Query is set up to extract only the most recent statistics results for each column.

closes #729

    • -0
    • +1
    ./exec/store/hbase/HBaseGroupScan.java
  1. … 143 more files in changeset.
DRILL-1329: External sort memory fixes

    • -0
    • +1
    ./exec/store/hbase/HBaseRecordReader.java
  1. … 38 more files in changeset.
DRILL-1281: Read into Direct Memory in Parquet Reader. Requires Hadoop 2.4 or above

    • -0
    • +14
    ./exec/store/hbase/HBaseRecordReader.java
  1. … 28 more files in changeset.
DRILL-933: Remove old physical operator cost & size concepts, add automatic size-based parallelization

    • -14
    • +5
    ./exec/store/hbase/HBaseGroupScan.java
    • -12
    • +0
    ./exec/store/hbase/HBaseSubScan.java
  1. … 94 more files in changeset.
DRILL-956: Fix issue where LocalPStore provider doesn't have correct Constructor signature

  1. … 4 more files in changeset.
DRILL-943 - Enable/disable Storage Plugin Instance

    • -1
    • +1
    ./exec/store/hbase/HBaseStoragePluginConfig.java
  1. … 9 more files in changeset.
DRILL-904: Fixes in project push down are causing HBase Filter pushdown to plan indefinitely.

    • -0
    • +1
    ./exec/store/hbase/HBaseGroupScan.java
    • -3
    • +1
    ./exec/store/hbase/HBaseStoragePlugin.java
Update projection pushdown so that it rewrites row type of scan.

    • -1
    • +3
    ./exec/store/hbase/HBaseStoragePlugin.java
  1. … 12 more files in changeset.
add digest of group scan to scan rel.

    • -2
    • +1
    ./exec/store/hbase/HBaseGroupScan.java
  1. … 11 more files in changeset.
Adding HBase Persistent Store.

+ Modified some interfaces and configuration keys.

Conflicts:

exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java

    • -0
    • +10
    ./exec/store/hbase/HBaseStoragePluginConfig.java
    • -0
    • +212
    ./exec/store/hbase/config/HBasePStore.java
    • -0
    • +115
    ./exec/store/hbase/config/HBasePStoreProvider.java
  1. … 40 more files in changeset.
DRILL-672: Queries against HBase table do not close after the data is returned.

    • -42
    • +173
    ./exec/store/hbase/HBaseGroupScan.java
    • -3
    • +1
    ./exec/store/hbase/HBaseRecordReader.java
    • -1
    • +13
    ./exec/store/hbase/HBaseSubScan.java
  1. … 3 more files in changeset.
DRILL-825: MaterializedField is mutable and is not suitable as a KEY in a MAP

+ Minor optimization/cleanup in HBaseRecordReader

    • -10
    • +7
    ./exec/store/hbase/HBaseRecordReader.java
  1. … 2 more files in changeset.
DRILL-680: INFORMATION_SCHEMA.COLUMNS does not display HBase column families

+ Enhanced result layout with option to set output width on per column basis.

+ Pretty print plan fragments.

    • -0
    • +62
    ./exec/store/hbase/DrillHBaseTable.java
    • -3
    • +3
    ./exec/store/hbase/HBaseSchemaFactory.java
  1. … 6 more files in changeset.
DRILL-781: Use MapVector as the top level vector for HBase Column Families

    • -108
    • +52
    ./exec/store/hbase/HBaseRecordReader.java
    • -1
    • +0
    ./exec/store/hbase/HBaseSchemaFactory.java
  1. … 4 more files in changeset.
DRILL-671: Select against hbase table with filter against row_key fails

    • -1
    • +2
    ./exec/store/hbase/HBaseFilterBuilder.java
  1. … 1 more file in changeset.
DRILL-783: Convert function support in HBase filter push down.

+ Enable HBase test suite (failures fixed by DRILL-761).

    • -0
    • +232
    ./exec/store/hbase/CompareFunctionsProcessor.java
    • -50
    • +24
    ./exec/store/hbase/HBaseFilterBuilder.java
  1. … 6 more files in changeset.
DRILL-757: Output mutator interface changes - Output mutator manages schema changes instead of record readers - Removed usages of deprecated interface

    • -4
    • +0
    ./exec/store/hbase/HBaseRecordReader.java
  1. … 13 more files in changeset.
DRILL-754: Scan of HBase table timed out

+ RAT: Ignore "*.patch" file in the project root directory.

    • -44
    • +45
    ./exec/store/hbase/HBaseRecordReader.java
  1. … 3 more files in changeset.
DRILL-604: Add schema type to INFORMATION_SCHEMA.SCHEMATA.

Modification for SubSchemaWrapper.

    • -0
    • +7
    ./exec/store/hbase/HBaseSchemaFactory.java
    • -1
    • +3
    ./exec/store/hbase/HBaseStoragePluginConfig.java
  1. … 15 more files in changeset.
DRILL-696: Use standard HBase configuration name/value in HBaseStoragePluginConfig

    • -23
    • +29
    ./exec/store/hbase/HBaseStoragePluginConfig.java
  1. … 7 more files in changeset.
DRILL-695: Push down column value predicates into HBase scan

    • -46
    • +169
    ./exec/store/hbase/HBaseFilterBuilder.java
    • -0
    • +23
    ./exec/store/hbase/HBaseGroupScan.java
    • -5
    • +29
    ./exec/store/hbase/HBasePushFilterIntoScan.java
    • -10
    • +5
    ./exec/store/hbase/HBaseRecordReader.java
  1. … 5 more files in changeset.
DRILL-683: Qualify HBase scan with specified columns even if row_key is required.

+ Added some log messages

    • -15
    • +6
    ./exec/store/hbase/HBaseRecordReader.java
    • -4
    • +3
    ./exec/store/hbase/HBaseScanBatchCreator.java
    • -4
    • +8
    ./exec/store/hbase/HBaseStoragePluginConfig.java
  1. … 1 more file in changeset.
DRILL-682: Improve HBase storage engine test execution

    • -0
    • +9
    ./exec/store/hbase/HBaseStoragePluginConfig.java
  1. … 13 more files in changeset.