DRILL-2060: Constant folding rule

2060 update - Constant folding work completed.

Fix issue with date, time and timestamp literal creation.

Fix literal creation during expression interpretation to match nullability of incoming expression.

Fix decimal literals in interpreted expression eval.

Disable the test that exposes a planning bug when the project instance of the constant folding rule is enabled. The rule does not actually influence the final plan, even when it fires and reduces expressions. This is due to our current cost model for project, which just counts the number of expressions and does not consider expression complexity. The issues have been logged in DRILL-2218 for further investigation; they do not need to be solved to merge the other constant folding rules and all of the interpreted expression work that has been done.
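To illustrate the cost-model issue described in this message, here is a minimal sketch, not Drill or Calcite code. Every class and method name in it (`Expr`, `Literal`, `Call`, `ProjectCostSketch`) is hypothetical. It compares a cost that only counts projected expressions with one that also counts expression nodes, for a projection before and after folding `1 + 2` into `3`.

```java
import java.util.Arrays;
import java.util.List;

// Toy expression tree. All names are hypothetical, not Drill or Calcite classes.
interface Expr {
    int nodeCount();
}

final class Literal implements Expr {
    final long value;
    Literal(long value) { this.value = value; }
    @Override public int nodeCount() { return 1; }
}

final class Call implements Expr {
    final String op;
    final List<Expr> args;
    Call(String op, Expr... args) {
        this.op = op;
        this.args = Arrays.asList(args);
    }
    @Override public int nodeCount() {
        int n = 1; // the operator node itself
        for (Expr arg : args) {
            n += arg.nodeCount();
        }
        return n;
    }
}

final class ProjectCostSketch {
    // Cost that only counts how many expressions are projected.
    static int countOnlyCost(List<Expr> projections) {
        return projections.size();
    }

    // Assumed alternative: cost that also reflects expression complexity.
    static int complexityCost(List<Expr> projections) {
        int total = 0;
        for (Expr e : projections) {
            total += e.nodeCount();
        }
        return total;
    }

    public static void main(String[] args) {
        List<Expr> unfolded = Arrays.<Expr>asList(new Call("+", new Literal(1), new Literal(2)));
        List<Expr> folded = Arrays.<Expr>asList(new Literal(3));
        System.out.println("count-only: " + countOnlyCost(unfolded) + " vs " + countOnlyCost(folded));
        System.out.println("complexity: " + complexityCost(unfolded) + " vs " + complexityCost(folded));
    }
}
```

Under the count-only cost both plans cost the same, so a planner comparing them has no reason to prefer the folded one. The node-count variant is only one possible way to make the difference visible.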

Get rid of clutter in RuleSets, explanation has been moved to the 2218 JIRA.

Belongs with 2060: fix constant expression executor to use the new constant expression interpreter interface that returns a ValueHolder instead of a ValueVector with a single value filled in.

2060 update - change test baseline due to new column ordering (no functional or performance impacting changes to plan)

2060 - address Aman's comments.

add test ignore - DRILL-2218

Baseline update for project pushdown test (only column ordering on a scan, no functional or performance impacting plan changes)

Turn back on project instance.

Small casting bug in constant executor.

Don't fold hive UDFs.

Modify DrillBuf to allow a BufferManager to be the owning context for a DrillBuf.

TODO - refactor to remove remaining common code from OperatorContext and FragmentContext, have them both use the new BufferManager.

Add system option for disabling constant folding.

2060 update - test option to disable constant folding.

Update RuleSets to actually allow turning the constant folding rules on and off, as well as establish a general pattern for turning logical rules on and off, similar to how some physical rules already can be.

Change the estimated row count in EasyGroupScan to report a number of files in the case where the file size indicates an estimated total count of 0 records. Allows very small files to be pruned.
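A minimal sketch of the fallback described in this message, under stated assumptions: the class name, method name, and the `assumedBytesPerRecord` parameter are hypothetical and are not taken from EasyGroupScan. It only shows the idea of reporting the file count when a size-based estimate rounds down to zero.

```java
final class RowCountEstimateSketch {
    // Hypothetical helper: estimate rows from total size, falling back to the
    // number of files when the size-based estimate would be 0 records.
    static long estimateRowCount(long totalBytes, long assumedBytesPerRecord, int fileCount) {
        if (assumedBytesPerRecord <= 0) {
            throw new IllegalArgumentException("assumedBytesPerRecord must be positive");
        }
        long bySize = totalBytes / assumedBytesPerRecord;
        return bySize == 0 ? fileCount : bySize;
    }

    public static void main(String[] args) {
        // Three tiny files: the size-based estimate is 0, so the file count is reported.
        System.out.println(estimateRowCount(10, 1024, 3));
        // Larger input: the size-based estimate is used as-is.
        System.out.println(estimateRowCount(4096, 1024, 3));
    }
}
```

With a non-zero estimate, scans over fewer small files no longer look identical in cost to scans over more of them.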

Fix folding expressions that result in null after refactoring the interpreted expression evaluation to return a ValueHolder in the case of a constant expression. Previously a value vector was returned, in the same manner as the interpreter can still do when given an input VectorAccessible and an expression that may contain field references. Calling getObject on the output vector previously handled nulls gracefully as they were passed into the Calcite API to create literals. This process has to be a bit more manual now.

Address Jinfeng's review comments.

A few more review comments.

Disable cost calculation change, complete fix will come in 2553.

Throw a runtime exception if there is an error materializing the expression. The same materialization will take place at query execution time, so we should fail early.

Add a test that does prune appropriately, still have a test for the outstanding issue tracked in DRILL-2553.

Small fix for test to properly set session option and set it back after completion.

Fixing comment that was garbled somehow.

Small fix for the case where an expression returns a null result during constant folding.

Add a little defensive code to give a good error message if a type that does not appear in the mapping from Drill to Calcite types attempts to be folded into a null value.
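The defensive check described in this message might look roughly like the sketch below. It is an illustration only: the enum `ToyMinorType`, the mapping contents, and the method `nullLiteralTypeFor` are all hypothetical stand-ins, not the actual Drill-to-Calcite type mapping.

```java
import java.util.EnumMap;
import java.util.Map;

final class NullFoldingSketch {
    // Hypothetical stand-in for the engine's type enum.
    enum ToyMinorType { INT, BIGINT, VARCHAR, MAP }

    // Hypothetical mapping to SQL type names; MAP is deliberately left out.
    private static final Map<ToyMinorType, String> TO_SQL_TYPE = new EnumMap<>(ToyMinorType.class);
    static {
        TO_SQL_TYPE.put(ToyMinorType.INT, "INTEGER");
        TO_SQL_TYPE.put(ToyMinorType.BIGINT, "BIGINT");
        TO_SQL_TYPE.put(ToyMinorType.VARCHAR, "VARCHAR");
    }

    // Look up the SQL type for a typed NULL literal, failing with a clear
    // message instead of a NullPointerException when no mapping exists.
    static String nullLiteralTypeFor(ToyMinorType type) {
        String sqlType = TO_SQL_TYPE.get(type);
        if (sqlType == null) {
            throw new UnsupportedOperationException(
                "Cannot fold expression to NULL: no SQL type mapping for " + type);
        }
        return sqlType;
    }

    public static void main(String[] args) {
        System.out.println(nullLiteralTypeFor(ToyMinorType.INT));
        try {
            nullLiteralTypeFor(ToyMinorType.MAP);
        } catch (UnsupportedOperationException e) {
            System.out.println(e.getMessage());
        }
    }
}
```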

  1. … 21 more files in changeset.
DRILL-1960: Automatic reallocation

  1. … 63 more files in changeset.
DRILL-2021: In ProjectRecordBatch, for the case where expr != ref, allow duplicates; when projecting, keep track of the used output names

  1. … 10 more files in changeset.
DRILL-1825: In ProjectRecordBatch, avoid requesting memory (will result in memory leak) when no output column is needed

  1. … 1 more file in changeset.
DRILL-1839: Clean up complex writers per batch if project has a ComplexWriter function evaluation

DRILL-1839: Clean up complex writers per batch if project has a ComplexWriter function evaluation

DRILL-1811: select A, * from ... crashes JVM

  1. … 1 more file in changeset.
DRILL-1811: select A, * from ... crashes JVM

  1. … 1 more file in changeset.
DRILL-1781: Fast Complex Schema

  1. … 44 more files in changeset.
DRILL-1828: In ProjectRecordBatch, add a case for expression classification

  1. … 1 more file in changeset.
DRILL-1382: Fast schema return

  1. … 75 more files in changeset.
Patch for DRILL-705

Currently only supports partitioning/ordering, not yet preceding or after offsets

  1. … 77 more files in changeset.
DRILL-1402: Add check-style rules for trailing space, TABs and blocks without braces

  1. … 440 more files in changeset.
DRILL-634: Cleanup/organize Java imports and trailing whitespaces from Drill code

  1. … 769 more files in changeset.
Fix issue introduced by DRILL-1202 where allocators are being closed after reporting success. Update ScreenRoot to cleanup before returning success. Update ScanBatch to cleanup reader in case of limit query to avoid memory leak in ParquetReader. Update allocators so that we don't have memory leak when using debug options. Update project record batch so that it doesn't try to return a released remainder.

  1. … 6 more files in changeset.
DRILL-1329: External sort memory fixes

  1. … 38 more files in changeset.
DRILL-1310: Fix assertion in ProjectRecordBatch for certain types of star queries.

DRILL-1324: Add mechanism to detect schema changes when adding a new primitive vector in a Map, RepeatedMap, RepeatedList vector

  1. … 13 more files in changeset.
DRILL-1278: Fix selecting scalar field from a map with join clause.

DRILL-1293: Fix assertion when selecting star column from view that also has star column.

  1. … 1 more file in changeset.
DRILL-1220: Process the star columns in ProjectRecordBatch by classifying the expressions appropriately based on the annotated expressions created by planner.

  1. … 3 more files in changeset.
DRILL-1060: Support ComplexToJson for Array Data Type

  1. … 9 more files in changeset.
DRILL-836: [addendum] Drill needs to return complex types (e.g., map and array) as a JSON string

* This contains additional changes to the original patch which was merged.

+ Renamed "flatten" to "complex-to-json"

+ With the new patch, we return VARCHAR instead of VARBINARY.

+ Added test case.

+ Minor code re-factoring.

  1. … 35 more files in changeset.
DRILL-836: Drill needs to return complex types (e.g., map and array) as a JSON string

  1. … 16 more files in changeset.
DRILL-935: Run-time code generation support for function which decodes string/varbinary into complex JSON object.

  1. … 23 more files in changeset.
DRILL-884: Always return a schema, even when there are no records

  1. … 28 more files in changeset.
Bug fixes in Project operator. Use allocateNewSafe() to allocate space for outgoing batch in Project.

Reenable testcase.

  1. … 1 more file in changeset.
DRILL-865: Interface changes to AbstractRecordBatch to enable easy collection of stats. Add BaseRootExec as a wrapper to collect stats for senders.

  1. … 17 more files in changeset.
DRILL-789: Left outer join returns "null" values for columns from the right table

The problem is that the "lastSet" field is not set on the Nullable mutator when the vector is loaded off the wire. The fix here is to make sure this gets set.

At the same time, this wouldn't be a problem, except for the fact that we fill in empty values when setValueCount is called, and if "lastSet" is not correct, we end up clobbering the existing data. We shouldn't need to call setValueCount on a transferred vector, so I am making this change in project record batch.
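The interaction between "lastSet" and setValueCount described above can be shown with a toy model. This is a sketch under assumptions, not Drill's vector code: `ToyNullableVector`, its `load` method, and the `updateLastSet` flag are hypothetical and exist only to reproduce the clobbering in miniature.

```java
import java.util.Arrays;

final class ToyNullableVector {
    private final int[] values;
    private final boolean[] isSet;
    private int lastSet = -1; // index of the last slot known to hold data

    ToyNullableVector(int capacity) {
        values = new int[capacity];
        isSet = new boolean[capacity];
    }

    // Simulates loading data off the wire. When updateLastSet is false,
    // this reproduces the bug: the data is present but lastSet is stale.
    void load(int[] data, boolean updateLastSet) {
        System.arraycopy(data, 0, values, 0, data.length);
        Arrays.fill(isSet, 0, data.length, true);
        if (updateLastSet) {
            lastSet = data.length - 1;
        }
    }

    // Fills every slot after lastSet with an empty (null) value.
    void setValueCount(int count) {
        for (int i = lastSet + 1; i < count; i++) {
            values[i] = 0;
            isSet[i] = false;
        }
        lastSet = Math.max(lastSet, count - 1);
    }

    Integer get(int index) {
        return isSet[index] ? values[index] : null;
    }

    public static void main(String[] args) {
        ToyNullableVector stale = new ToyNullableVector(3);
        stale.load(new int[] {7, 8, 9}, false);
        stale.setValueCount(3); // clobbers the loaded data
        System.out.println(stale.get(0));

        ToyNullableVector fixed = new ToyNullableVector(3);
        fixed.load(new int[] {7, 8, 9}, true);
        fixed.setValueCount(3); // nothing to fill, data survives
        System.out.println(fixed.get(0));
    }
}
```

In this model either remedy from the message avoids the problem: keeping lastSet correct on load, or not calling setValueCount on a vector whose contents were transferred.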

  1. … 2 more files in changeset.
Fix for wrong project column ordering.

  1. … 44 more files in changeset.