DRILL-3993: Changes after code review.

… 4 more files in changeset.
DRILL-3993: Changes to support Calcite 1.15.

Fix AssertionError: type mismatch for tests with aggregate functions.

Fix VARIANCE agg function

Remove use of deprecated Subtype enum

Fix 'Failure while loading table a in database hbase' error

Fix 'Field ordinal 1 is invalid for type '(DrillRecordRow[*])'' unit test failures

… 17 more files in changeset.
DRILL-5337: OpenTSDB storage plugin

closes #774

    • ./store/openTSDB/Constants.java (-0, +32)
    • ./store/openTSDB/DrillOpenTSDBTable.java (-0, +81)
    • ./store/openTSDB/OpenTSDBBatchCreator.java (-0, +53)
    • ./store/openTSDB/OpenTSDBGroupScan.java (-0, +169)
    • ./store/openTSDB/OpenTSDBRecordReader.java (-0, +258)
    • ./store/openTSDB/OpenTSDBScanSpec.java (-0, +42)
    • ./store/openTSDB/OpenTSDBStoragePlugin.java (-0, +77)
    • ./store/openTSDB/OpenTSDBStoragePluginConfig.java (-0, +77)
    • ./store/openTSDB/OpenTSDBSubScan.java (-0, +132)
    • ./store/openTSDB/Util.java (-0, +66)
    • ./store/openTSDB/client/OpenTSDB.java (-0, +50)
    • ./store/openTSDB/client/OpenTSDBTypes.java (-0, +28)
    • ./store/openTSDB/client/Schema.java (-0, +124)
    • ./store/openTSDB/client/Service.java (-0, +55)
    • ./store/openTSDB/client/query/DBQuery.java (-0, +148)
… 13 more files in changeset.
DRILL-1328: Support table statistics - Part 2

Add support for avg row-width and major type statistics.

Parallelize the ANALYZE implementation and stats UDF implementation to improve stats collection performance.

Update/fix rowcount, selectivity and ndv computations to improve plan costing.

Add options for configuring collection/usage of statistics.

Add new APIs and implementation for stats writer (as a precursor to Drill Metastore APIs).

Fix several stats/costing related issues identified while running TPC-H and TPC-DS queries.

Add support for CPU sampling and nested scalar columns.

Add more test cases for collection and usage of statistics and fix remaining unit/functional test failures.

Thanks to Venki Korukanti (@vkorukanti) for the description below (modified to account for new changes). He graciously agreed to rebase the patch onto the latest master, fixed a few issues, and added a few tests.

FUNCS: Statistics functions as UDFs:

Separate

Currently using FieldReader to ensure consistent output type so that Unpivot doesn't get confused. All stats columns should be Nullable, so that stats functions can return NULL when N/A.

* custom versions of "count" that always return BigInt

* HyperLogLog-based NDV that returns BigInt and works only on VarChars

* HyperLogLog with binary output that only works on VarChars
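
For readers unfamiliar with HyperLogLog, the following is a minimal sketch of the technique behind an NDV (number of distinct values) estimate over strings. It is not the code from this changeset: the class name HllSketch, the hash function, and the precision range are all assumptions made for illustration. estimate() corresponds loosely to the BigInt-returning variant and toBytes() to the binary-output variant.

```java
import java.nio.charset.StandardCharsets;

/** Minimal HyperLogLog sketch over strings. Illustrative only; not Drill's code. */
final class HllSketch {
    private final int p;             // number of index bits
    private final int m;             // register count, 2^p
    private final byte[] registers;  // highest rank seen per register

    HllSketch(int p) {
        if (p < 7 || p > 16) {
            throw new IllegalArgumentException("p must be in [7, 16]");
        }
        this.p = p;
        this.m = 1 << p;
        this.registers = new byte[m];
    }

    /** 64-bit FNV-1a hash followed by the MurmurHash3 64-bit finalizer. */
    private static long hash64(String s) {
        long h = 0xcbf29ce484222325L;
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            h ^= (b & 0xff);
            h *= 0x100000001b3L;
        }
        h ^= h >>> 33;
        h *= 0xff51afd7ed558ccdL;
        h ^= h >>> 33;
        h *= 0xc4ceb9fe1a85ec53L;
        h ^= h >>> 33;
        return h;
    }

    /** Records one value; null inputs are ignored. */
    void add(String value) {
        if (value == null) {
            return;
        }
        long h = hash64(value);
        int idx = (int) (h >>> (64 - p));  // top p bits pick the register
        long rest = h << p;                // remaining bits, left-aligned
        int rank = Math.min(Long.numberOfLeadingZeros(rest) + 1, 64 - p + 1);
        if (rank > registers[idx]) {
            registers[idx] = (byte) rank;
        }
    }

    /** Folds another sketch of the same precision into this one. */
    void merge(HllSketch other) {
        if (other.p != p) {
            throw new IllegalArgumentException("precision mismatch");
        }
        for (int i = 0; i < m; i++) {
            if (other.registers[i] > registers[i]) {
                registers[i] = other.registers[i];
            }
        }
    }

    /** Binary form of the sketch: a copy of the register array. */
    byte[] toBytes() {
        return registers.clone();
    }

    /** Estimated number of distinct values as a 64-bit integer. */
    long estimate() {
        double sum = 0.0;
        int zeros = 0;
        for (byte r : registers) {
            sum += Math.pow(2.0, -r);
            if (r == 0) {
                zeros++;
            }
        }
        double alpha = 0.7213 / (1.0 + 1.079 / m);  // bias constant for m >= 128
        double raw = alpha * m * m / sum;
        if (raw <= 2.5 * m && zeros > 0) {
            // Small-range correction: fall back to linear counting.
            return Math.round(m * Math.log((double) m / zeros));
        }
        return Math.round(raw);
    }
}
```

Because merging takes the per-register maximum, two sketches built over disjoint slices of the data merge to exactly the sketch that a single pass over all of it would produce. That property is what makes a binary sketch useful as an intermediate result; whether the changeset combines partial results this way is an assumption here.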

OPS: Updated protobufs for new ops

OPS: Implemented StatisticsMerge

OPS: Implemented StatisticsUnpivot

ANALYZE: AnalyzeTable functionality

* JavaCC syntax more-or-less copied from LucidDB.

* (Basic) AnalyzePrule: DrillAnalyzeRel -> UnpivotPrel StatsMergePrel FilterPrel(for sampling) StatsAggPrel ScanPrel

ANALYZE: Add getMetadataTable() to AbstractSchema

USAGE: Change field access in QueryWrapper

USAGE: Add getDrillTable() to DrillScanRelBase and ScanPrel

* since ScanPrel does not inherit from DrillScanRelBase, this requires adding a DrillTable to the constructor

* This is done so that a custom ReflectiveRelMetadataProvider can access the DrillTable associated with Logical/Physical scans.

USAGE: Attach DrillStatsTable to DrillTable.

* DrillStatsTable represents the data scanned from a corresponding ".stats.drill" table

* In order to avoid doing query execution right after the ".stats.drill" table is found, metadata is not actually collected until the MaterializationVisitor is used.

** Currently, the metadata source must be a string (so that a SQL query can be created). Doing this with a table is probably more complicated.

** Query is set up to extract only the most recent statistics results for each column.
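
The "most recent statistics results for each column" selection can be pictured with a small in-memory sketch. The commit does this with a SQL query against the ".stats.drill" table; the Java below is only an equivalent illustration, and the class LatestStats, the Row fields, and the timestamp representation are hypothetical names, not taken from the changeset.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Keeps only the newest statistics row per column. Illustrative; names are hypothetical. */
final class LatestStats {

    /** One statistics row: the column it describes, when it was computed, and its NDV. */
    static final class Row {
        final String column;
        final long computedAtMillis;
        final long ndv;

        Row(String column, long computedAtMillis, long ndv) {
            this.column = column;
            this.computedAtMillis = computedAtMillis;
            this.ndv = ndv;
        }
    }

    /** Returns, for each column, the row with the greatest computedAtMillis. */
    static Map<String, Row> latestPerColumn(List<Row> rows) {
        Map<String, Row> latest = new LinkedHashMap<>();
        for (Row r : rows) {
            // Keep the existing row unless the new one is at least as recent.
            latest.merge(r.column, r,
                (old, cur) -> cur.computedAtMillis >= old.computedAtMillis ? cur : old);
        }
        return latest;
    }
}
```

Given rows for the same column from two ANALYZE runs, only the row with the later timestamp survives, so older results stay in storage but do not influence the statistics that are read back.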

closes #729

    • ./store/openTSDB/OpenTSDBGroupScan.java (-0, +1)
… 143 more files in changeset.