DRILL-4445: Standardize the Physical and Logical plan nodes to use Lists instead of arrays for their inputs

Remove some extra translation logic used to move between the

two representations.

TODO - look back at the Join logical node: it has two JsonCreator annotations,

but only one will be used. It is unclear whether the choice between them

is documented behavior, so we should just fix it on our end.

  1. … 26 more files in changeset.
DRILL-4260: Adding support for some custom window frames

this includes the following JIRAs:

DRILL-4261: Add support for RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING

DRILL-4262: add support for ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW

DRILL-4263: add support for RANGE BETWEEN CURRENT ROW AND CURRENT ROW

this closes #340

  1. … 22 more files in changeset.
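To make the three frames named in DRILL-4261, DRILL-4262 and DRILL-4263 concrete, here is a minimal sketch of what each frame means for a SUM over one partition that is already sorted by its ORDER BY key. The class and method names are hypothetical and are not taken from the Drill changeset; this only illustrates the SQL frame semantics, not how Drill implements them.

```java
import java.util.Arrays;

// Hypothetical illustration of window frame semantics; not Drill code.
class WindowFrameSketch {

    // RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING:
    // every row sees the whole partition, so every row gets the total.
    static long[] sumWholePartition(long[] values) {
        long total = 0;
        for (long v : values) {
            total += v;
        }
        long[] out = new long[values.length];
        Arrays.fill(out, total);
        return out;
    }

    // ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW:
    // each row sees the rows from the partition start up to itself (running sum).
    static long[] sumRowsToCurrent(long[] values) {
        long[] out = new long[values.length];
        long running = 0;
        for (int i = 0; i < values.length; i++) {
            running += values[i];
            out[i] = running;
        }
        return out;
    }

    // RANGE BETWEEN CURRENT ROW AND CURRENT ROW:
    // each row sees its peers, i.e. the rows with the same ORDER BY key.
    // Assumes orderKeys is sorted and has the same length as values.
    static long[] sumPeers(long[] orderKeys, long[] values) {
        long[] out = new long[values.length];
        int start = 0;
        while (start < values.length) {
            int end = start;
            long sum = 0;
            while (end < values.length && orderKeys[end] == orderKeys[start]) {
                sum += values[end];
                end++;
            }
            Arrays.fill(out, start, end, sum);
            start = end;
        }
        return out;
    }
}
```

With order keys {1, 1, 2} and values {10, 20, 30}, the first two rows are peers, so the RANGE CURRENT ROW frame differs from the ROWS running sum on those rows.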
DRILL-3012: Fix issue where remote values rel wasn't losing operatorId.

Also enhance the Union rule to avoid more than 2-way inputs

  1. … 3 more files in changeset.
DRILL-2936: Use SpoolingRawBatchBuffer for HashToMergeExchange in order to avoid deadlocks

Refactored common code in UnlimitedRawBatchBuffer and SpoolingRawBatchBuffer

into BaseRawBatchBuffer

Removed reflection-based construction of RawBatchBuffer. Now choose the implementation

based on the plan

Updated SpoolingRawBatchBuffer to use a separate thread for spooling

  1. … 13 more files in changeset.
DRILL-2695: Add support for large IN conditions through the use of the Values operator.

Update JSON reader to support reading extended JSON.

Update JSON writer to support writing extended JSON data.

Update JSON reader to automatically unwrap a file that includes a single top-level array (used by values).

Update Options manager to use getOption(<Type>Validator) to directly retrieve typed value.

Remove JSON rewinding.

Add support for CONVERT_TO( [], 'SIMPLEJSON') to disable extended types as part of UDF use.

  1. … 65 more files in changeset.
DRILL-2210 Introducing multithreading capability to PartitonerSender

  1. … 15 more files in changeset.
DRILL-133: LocalExchange planning and exec.

    • -0
    • +143
    ./AbstractDeMuxExchange.java
    • -0
    • +125
    ./AbstractMuxExchange.java
  1. … 55 more files in changeset.
DRILL-2715: Implement nested loop join operator

    • -0
    • +105
    ./NestedLoopJoinPOP.java
  1. … 9 more files in changeset.
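For readers unfamiliar with the technique named in DRILL-2715, a nested loop join compares every left row with every right row and emits the pairs that satisfy the join condition. The sketch below is a generic illustration over in-memory lists; the class name, method name and string output format are assumptions made for this example and do not reflect NestedLoopJoinPOP or Drill's record-batch based operator.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiPredicate;

// Hypothetical, generic nested loop join; not Drill code.
class NestedLoopJoinSketch {

    // Emits "left,right" for every pair of rows that satisfies the condition.
    // Cost is O(|left| * |right|), but any condition works, not only equality.
    static <L, R> List<String> join(List<L> left, List<R> right, BiPredicate<L, R> condition) {
        List<String> out = new ArrayList<>();
        for (L l : left) {          // outer loop over the left input
            for (R r : right) {     // inner loop rescans the right input
                if (condition.test(l, r)) {
                    out.add(l + "," + r);
                }
            }
        }
        return out;
    }
}
```

Because the condition is an arbitrary predicate, this style of join can evaluate inequality conditions such as `l < r`, which a hash join cannot.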
1908: new window function implementation

  1. … 25 more files in changeset.
DRILL-1846: Use max receiver width during stats collection for parallelism planning. SingleMergeExchange and UnionExchange have a max receive width of 1. Others can go higher.

  1. … 4 more files in changeset.
DRILL-1517: Update Foreman to improve state management.

  1. … 52 more files in changeset.
DRILL-1333: Flatten operator for allowing more complex queries against repeated data.

  1. … 34 more files in changeset.
DRILL-1384: Part 1 - Rebase on Calcite. Change code due to Calcite package renaming/re-structure.

Optiq changed to use DATETIME_PLUS. Have to handle it in Drill.

PushFilterPastJoinRule has some issue. Temp fix for that.

Failed unit tests:

1) TestFlatten

2) TestConvertFunctions / TestComplexTypeWriter : "Concat"

3) TPCH Q16 : CanNotPlanException

Feed a RelDataTypeSystem into planner, to support decimal with precision/scale up to 38.

Remove assertion in DrillFilterRel. Optiq/Calcite could create a TRUE AND TRUE for a query like WHERE col1 in (select ...) and col2 in (select ...).

Rebase on calcite-1.1.0-drill-test-r1. Change code due to Calcite package renaming/re-structure.

Rebase on Calcite: renaming with perl script. Part 1

reverse change to jdbc test.

Renaming for rebasing calcite. Part 2

Renaming for calcite rebasing. Part 3

Renaming for calcite rebasing. Part 4

Reverse change to testcase in jdbc.

Renaming for calcite rebasing. Part 5

Renaming for calcite rebasing. Part 6

remove 1.sh

WindowRel change related.

Renaming for calcite rebase. Part 7

PreprocessLogical and AggPrelBase

Renaming for calcite rebasing. Part 8. More manual change

Rebasing Calcite. Part 9

Rebasing calcite. Part 10

Rebasing API change from Calcite.

SQL parser change, due to Calcite rebasing.

Renaming change for calcite rebasing.

Renaming package due to Calcite rebasing.

Renaming package due to Calcite Rebase.

Work in progress for calcite rebasing.

Change import package names due to Calcite rebase.

Code refactor due to Calcite rebasing.

Fix bug in DistributionTraitDef.

Resolve compiler error, due to Calcite Rebasing.

Resolve compiler error after Calcite Rebasing.

minor change.

  1. … 259 more files in changeset.
Patch for DRILL-705

Currently only supports partitioning/ordering, not yet preceding or

after offsets

  1. … 77 more files in changeset.
DRILL-1402: Add check-style rules for trailing space, TABs and blocks without braces

  1. … 439 more files in changeset.
DRILL-634: Cleanup/organize Java imports and trailing whitespaces from Drill code

  1. … 762 more files in changeset.
DRILL-1328: Support table statistics - Part 2

Add support for avg row-width and major type statistics.

Parallelize the ANALYZE implementation and stats UDF implementation to improve stats collection performance.

Update/fix rowcount, selectivity and ndv computations to improve plan costing.

Add options for configuring collection/usage of statistics.

Add new APIs and implementation for stats writer (as a precursor to Drill Metastore APIs).

Fix several stats/costing related issues identified while running TPC-H and TPC-DS queries.

Add support for CPU sampling and nested scalar columns.

Add more testcases for collection and usage of statistics and fix remaining unit/functional test failures.

Thanks to Venki Korukanti (@vkorukanti) for the description below (modified to account for new changes). He graciously agreed to rebase the patch to latest master, fixed a few issues and added a few tests.

FUNCS: Statistics functions as UDFs:

Separate

Currently using FieldReader to ensure consistent output type so that Unpivot doesn't get confused. All stats columns should be Nullable, so that stats functions can return NULL when N/A.

* custom versions of "count" that always return BigInt

* HyperLogLog based NDV that returns BigInt that works only on VarChars

* HyperLogLog with binary output that only works on VarChars

OPS: Updated protobufs for new ops

OPS: Implemented StatisticsMerge

OPS: Implemented StatisticsUnpivot

ANALYZE: AnalyzeTable functionality

* JavaCC syntax more-or-less copied from LucidDB.

* (Basic) AnalyzePrule: DrillAnalyzeRel -> UnpivotPrel StatsMergePrel FilterPrel(for sampling) StatsAggPrel ScanPrel

ANALYZE: Add getMetadataTable() to AbstractSchema

USAGE: Change field access in QueryWrapper

USAGE: Add getDrillTable() to DrillScanRelBase and ScanPrel

* since ScanPrel does not inherit from DrillScanRelBase, this requires adding a DrillTable to the constructor

* This is done so that a custom ReflectiveRelMetadataProvider can access the DrillTable associated with Logical/Physical scans.

USAGE: Attach DrillStatsTable to DrillTable.

* DrillStatsTable represents the data scanned from a corresponding ".stats.drill" table

* In order to avoid doing query execution right after the ".stats.drill" table is found, metadata is not actually collected until the MaterializationVisitor is used.

** Currently, the metadata source must be a string (so that a SQL query can be created). Doing this with a table is probably more complicated.

** Query is set up to extract only the most recent statistics results for each column.

closes #729

    • -0
    • +69
    ./StatisticsMerge.java
  1. … 141 more files in changeset.
DRILL-1329: External sort memory fixes

  1. … 38 more files in changeset.
DRILL-1078: Added metrics to unordered receiver (+ renaming) and merging receiver.

  1. … 5 more files in changeset.
DRILL-1055: Add ProducerConsumer operator to scans

This can be disabled. The queue size is configurable

    • -0
    • +58
    ./ProducerConsumer.java
  1. … 13 more files in changeset.
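The idea behind a producer-consumer stage with a configurable queue size can be sketched with a bounded blocking queue: a producer thread reads ahead while the consumer processes, and the queue capacity limits how far ahead the producer may run. The names below (ProducerConsumerSketch, drain, the END sentinel) are hypothetical and are not taken from ProducerConsumer.java; this is a sketch of the general pattern under those assumptions, not Drill's implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical bounded producer-consumer; not Drill code.
class ProducerConsumerSketch {

    // Sentinel object that marks the end of the stream.
    private static final Object END = new Object();

    // Pushes the batches through a queue of the given capacity and returns them
    // in the order the consumer received them.
    static List<Object> drain(List<?> batches, int queueSize) {
        BlockingQueue<Object> queue = new ArrayBlockingQueue<>(queueSize);

        Thread producer = new Thread(() -> {
            try {
                for (Object batch : batches) {
                    queue.put(batch);   // blocks while the queue is full
                }
                queue.put(END);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        List<Object> consumed = new ArrayList<>();
        try {
            // take() blocks while the queue is empty
            for (Object item = queue.take(); item != END; item = queue.take()) {
                consumed.add(item);
            }
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while consuming", e);
        }
        return consumed;
    }
}
```

A small capacity bounds memory held by queued batches; a larger one lets the producer read further ahead of a slow consumer.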
DRILL-1069: Rename RandomReceiver to UnorderedReceiver.

    • -0
    • +64
    ./UnorderedReceiver.java
  1. … 7 more files in changeset.
DRILL-1022: Increase default min hash table size and allow setting min/max size for hash table.

  1. … 8 more files in changeset.
DRILL-836: [addendum] Drill needs to return complex types (e.g., map and array) as a JSON string

* This contains additional changes to the original patch which was merged.

+ Renamed "flatten" to "complex-to-json"

+ With the new patch, we return VARCHAR instead of VARBINARY.

+ Added test case.

+ Minor code re-factoring.

    • -0
    • +59
    ./ComplexToJson.java
  1. … 37 more files in changeset.
Merge fixes.

  1. … 19 more files in changeset.
DRILL-933: Remove old physical operator cost & size concepts, add automatic size-based parallelization

  1. … 81 more files in changeset.
DRILL-836: Drill needs to return complex types (e.g., map and array) as a JSON string

  1. … 17 more files in changeset.
DRILL-968: Use checkstyle plugin to prevent inadvertent use of shaded Guava classes

+ Disallow non-static '*' imports in handwritten code.

+ Updated the current code to be in compliance.

+ Run 'rat' plugin in 'validate' phase.

  1. … 102 more files in changeset.
DRILL-600: Support planning for Union-All. Added infrastructure for planning Union-Distinct (not enabled yet).

  1. … 18 more files in changeset.
Remove references to jcommander's copy of Guava's Lists class.

  1. … 31 more files in changeset.