drill

DRILL-2327: Raise the max value allowed for join's cardinality estimate factor to allow for hugely expanding joins.

DRILL-2262: selecting columns of certain datatypes from a dictionary encoded parquet file created by drill fails

DRILL-2328: For concat function, null input is treated as empty string; for concat operator (i.e., ||), if any input is null, the output is null

    • -1
    • +1
    /_docs/dev-custom-fcn/001-dev-simple.md
  1. … 43 more files in changeset.
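The null-handling difference described in DRILL-2328 can be illustrated with a minimal sketch. The class and method names below are hypothetical stand-ins written for illustration; they are not Drill's implementation, which operates on value holders rather than `String`.

```java
// Hypothetical illustration of the two null-handling rules from DRILL-2328.
// These helpers are NOT Drill's code; they only model the described behavior.
final class ConcatNullSemantics {

    // concat(a, b, ...): a null input is treated as an empty string.
    static String concatFunction(String... inputs) {
        StringBuilder sb = new StringBuilder();
        for (String s : inputs) {
            if (s != null) {
                sb.append(s);
            }
        }
        return sb.toString();
    }

    // a || b: if any input is null, the output is null.
    static String concatOperator(String left, String right) {
        if (left == null || right == null) {
            return null;
        }
        return left + right;
    }

    public static void main(String[] args) {
        System.out.println(concatFunction("foo", null, "bar")); // prints "foobar"
        System.out.println(concatOperator("foo", null));        // prints "null"
    }
}
```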
DRILL-2253: PART-2 Vectorized Parquet reader fails to read correctly against RLE Dictionary encoded DATE column

DRILL-2289: User email still points to the old address (incubator.apache.org); it should be user@drill.apache.org.

DRILL-2321: FlattenRecordBatch should transfer vectors honoring output field reference.

DRILL-2266: Complex parquet reader fails on reading timestamp datatype

DRILL-2267: bumping parquet version to 1.5.1-r7

DRILL-2316: Add hive, parquet, json ref docs, basics tutorial, and minor edits

    • -0
    • +27
    /_docs/009-datasources.md
    • -0
    • +37
    /_docs/010-dev-custom-func.md
    • -0
    • +14
    /_docs/011-manage.md
    • -0
    • +9
    /_docs/014-contribute.md
    • -0
    • +10
    /_docs/015-sample-ds.md
    • -0
    • +13
    /_docs/016-design.md
    • -0
    • +170
    /_docs/019-bylaws.md
    • -0
    • +188
    /_docs/data-sources/001-hive-types.md
    • -0
    • +39
    /_docs/data-sources/002-hive-udf.md
    • -0
    • +287
    /_docs/data-sources/003-parquet-ref.md
  1. … 34 more files in changeset.
DRILL-1385, along with some cleanup

Cleaned up option handling. This includes using finals, making member variables private whenever possible, and some formatting.

- fixed a bug in the string formatting for the double range validator

- OptionValidator, OptionValue, and their implementations now conspire not to allow the creation of malformed options because the OptionType has been added to validator calls to handle OptionValues that are created on demand.

Started with updated byte code rewrite from Jacques

Fixed several problems with scalar value replacement:

- use consistent ASM api version throughout

- stop using deprecated ASM methods (actually causes bugs)

- visitMethodInsn()

- added a couple of missing super.visitEnd()s

- fixed a couple of minor FindBugs issues

- accounted for required stack size increases when replacing holders for

longs and doubles

- added accounting for frame offsets to cope with long and double local

variables and value holder members

- fixed a few minor bugs found with FindBugs

- stop using carrotlabs' hash map lget() method on shared constant data

- fixed an incorrect use of DUP2 on objectrefs when copying long or double

holder members into locals

- fixed a problem with redundant POP instructions left behind after replacement

- fixed a problem with incorrect DUPs in multiple assignment statements

- fixed a problem with DUP_X1 replacement when handling constants in multiple

assignment statements

- fixed a problem with non-replaced holder member post-decrements

- don't replace holders passed to static functions as "out" parameters

(common with Accessors on repeated value vectors)

- increased the maximum required stack size when transferring holder members to

locals

- changed the code generation block type mappings for constants for external

sorts

- fixed problems handling constant and non-constant member variables in

operator classes

- in general, if a holder is assigned to or from an operator member variable,

it can't be replaced (at least not until we replace those as well)

- Use a derived ASM Analyzer (MethodAnalyzer) and Frame (AssignmentTrackingFrame) in order to establish relationships between assignments of holders through chains of local variables. This effectively back-propagates non-replaceability attributes so that if a holder variable that can't be replaced is assigned to from another holder variable, that second one cannot be replaced either, and so on through longer chains of assignments.

- code for dumping generated source code

- MergeAdapter dumps before and after results of scalar replacement

(if it's on)

- fixed some problems in ReplacingBasicValue by replacing HashSet with

IdentityHashMap

- made loggers private

- added a retry strategy for scalar replacement: if scalar replacement code rewriting fails, this regenerates the bytecode without scalar replacement.

- bytecode verification is always on now (required for the retry strategy)

- use system option to determine whether scalar replacement should be used

- default option: if scalar replacement fails, retry without it

- force replacement on or off

- unit tests for the retry strategy are based on a single known failure case,

covered by DRILL-2326.

- add tests TestConvertFunctions to test the three scalar replacement options

for the failing test case (testVarCharReturnTripConvertLogical)

- made it possible to set a SYSTEM option as a java property in Drillbit

- added a command line argument to force scalar replacement to be on during

testing in the rootmost pom.xml

In the course of this, added increased checking of intermediate stages of code

rewriting, as well as logging of classes that cause failures.

- work around a bug in ASM's CheckClassAdapter that doesn't allow for checking

of inner classes

Added comments, tidied up formatting, and added "final" in a number of places.
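The back-propagation of non-replaceability through chains of holder assignments, described in the commit message above, can be sketched as a simple worklist pass. This is a simplified model under stated assumptions: variables are plain strings, the assignment relation is a hypothetical map from each target to its sources, and none of this uses ASM or the MethodAnalyzer/AssignmentTrackingFrame classes named in the commit.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model: if a non-replaceable holder is assigned to from another
// holder, that source holder becomes non-replaceable too, transitively.
final class NonReplaceabilityPropagation {

    /**
     * @param assignedFrom maps a target variable to the variables assigned into it
     * @param pinned       variables already known to be non-replaceable
     * @return every variable that cannot be replaced after propagation
     */
    static Set<String> propagate(Map<String, List<String>> assignedFrom, Set<String> pinned) {
        Set<String> result = new HashSet<>(pinned);
        Deque<String> work = new ArrayDeque<>(pinned);
        while (!work.isEmpty()) {
            String target = work.pop();
            for (String source : assignedFrom.getOrDefault(target, Collections.emptyList())) {
                // Only enqueue variables that were not already marked.
                if (result.add(source)) {
                    work.push(source);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<String>> assignedFrom = new HashMap<>();
        assignedFrom.put("c", Arrays.asList("b")); // c = b
        assignedFrom.put("b", Arrays.asList("a")); // b = a
        assignedFrom.put("e", Arrays.asList("d")); // e = d (unrelated chain)
        Set<String> pinned = new HashSet<>(Arrays.asList("c"));
        // a, b, and c end up non-replaceable; d and e are unaffected.
        System.out.println(propagate(assignedFrom, pinned));
    }
}
```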

Signed-off-by: vkorukanti <venki.korukanti@gmail.com>

  1. … 39 more files in changeset.
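The retry strategy and its three modes (default: retry without scalar replacement on failure; forced on; forced off) might look roughly like the sketch below. The `Mode` enum, the `Rewriter` interface, and the use of an unchecked exception to signal a failed rewrite are all assumptions made for illustration; the commit does not show the actual option names or types.

```java
// Hypothetical sketch of "try scalar replacement, fall back if it fails".
// Names and types here are illustrative, not Drill's actual API.
final class ScalarReplacementRetry {

    enum Mode { TRY, ON, OFF }

    // Stand-in for the code rewriting step; assumed to throw an unchecked
    // exception when rewriting or bytecode verification fails.
    interface Rewriter {
        byte[] rewrite(byte[] bytecode, boolean scalarReplacement);
    }

    static byte[] generate(byte[] original, Mode mode, Rewriter rewriter) {
        if (mode == Mode.OFF) {
            return rewriter.rewrite(original, false);
        }
        try {
            return rewriter.rewrite(original, true);
        } catch (RuntimeException e) {
            if (mode == Mode.ON) {
                throw e; // replacement was forced on, so surface the failure
            }
            // Default mode: regenerate without scalar replacement.
            return rewriter.rewrite(original, false);
        }
    }

    public static void main(String[] args) {
        Rewriter failing = (code, sr) -> {
            if (sr) {
                throw new IllegalStateException("verification failed");
            }
            return code;
        };
        byte[] out = generate(new byte[] {42}, Mode.TRY, failing);
        System.out.println(out[0]); // prints 42
    }
}
```

The fallback only works if failures are detected at rewrite time, which is consistent with the note above that bytecode verification is always on.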
DRILL-2245: Clean up query setup and execution kickoff in Foreman/WorkManager in order to ensure consistent handling, and avoid hangs and races, with the goal of improving Drillbit robustness.

I did my best to keep these clean when I split them up, but this core commit may depend on some minor changes in the hygiene commit that is also associated with this bug, so either both should be applied, or neither.

The core commit should be applied first.

protocol/pom.xml

- updated protocol buffer compiler version to 2.6

- this made slight modifications to the formats of a few committed protobuf

files

AutoCloseables

- created org.apache.drill.common.AutoCloseables to handle closing these

quietly
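As a rough idea of what "closing quietly" means here, the sketch below closes an `AutoCloseable` and reports any failure instead of propagating it. The class name, method name, and boolean return value are hypothetical; the real org.apache.drill.common.AutoCloseables may have a different signature and would presumably log through a logger rather than `System.err`.

```java
// Hypothetical sketch of a quiet-close helper; not the actual AutoCloseables API.
final class QuietCloser {

    // Closes the resource, swallowing (but reporting) any exception.
    // Returns true if the close succeeded or there was nothing to close.
    static boolean closeQuietly(AutoCloseable resource) {
        if (resource == null) {
            return true;
        }
        try {
            resource.close();
            return true;
        } catch (Exception e) {
            System.err.println("Failure on close(): " + e);
            return false;
        }
    }

    public static void main(String[] args) {
        AutoCloseable broken = () -> {
            throw new java.io.IOException("disk gone");
        };
        System.out.println(closeQuietly(broken)); // prints false
    }
}
```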

BaseTestQuery, and derivatives

- factored out pieces into QueryTestUtil so they can be reused

DeferredException:

- created this so we can collect exceptions during the shutdown process
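One plausible way to collect exceptions during shutdown is to keep the first failure and attach later ones to it with `Throwable.addSuppressed`. The sketch below takes that approach; the class name `DeferredFailures` and its methods are hypothetical, and the actual DeferredException in this commit may be structured differently.

```java
import java.util.Objects;

// Hypothetical sketch: remember the first exception, suppress the rest into it,
// and rethrow once all shutdown steps have been attempted.
final class DeferredFailures implements AutoCloseable {

    private Exception first;

    synchronized void add(Exception e) {
        Objects.requireNonNull(e, "exception");
        if (first == null) {
            first = e;
        } else if (first != e) {
            // addSuppressed rejects self-suppression, hence the identity check.
            first.addSuppressed(e);
        }
    }

    synchronized Exception get() {
        return first;
    }

    // Throws the collected exception, if any, and resets the collector.
    @Override
    public synchronized void close() throws Exception {
        if (first != null) {
            Exception e = first;
            first = null;
            throw e;
        }
    }

    public static void main(String[] args) {
        DeferredFailures deferred = new DeferredFailures();
        deferred.add(new IllegalStateException("store failed to close"));
        deferred.add(new IllegalStateException("work manager failed to close"));
        try {
            deferred.close();
        } catch (Exception e) {
            // prints "store failed to close (+1 suppressed)"
            System.out.println(e.getMessage() + " (+" + e.getSuppressed().length + " suppressed)");
        }
    }
}
```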

Drillbit

- uses AutoCloseables for the WorkManager and for the storeProvider

- allow start() to take a RemoteServiceSet

- private, final, formatting

Foreman

- added new state CANCELLATION_REQUESTED (via UserBitShared.proto) to represent

the time between request of a cancellation, and acknowledgement from all

remote endpoints running fragments on a query's behalf

- created ForemanResult to manage interleaving cleanup effects/failure with

query result state

- does not need to implement Comparable

- does not need to implement Closeable

- thread blocking fixes

- add resultSent flag

- add code to log plan fragments with endpoint assignments

- added finals, cleaned up formatting

- do queue management in acquireQuerySemaphore; local tests pass

- rename getContext() to getQueryContext()

- retain DrillbitContext

- a couple of exception injections for testing

- minor formatting

- TODOs

FragmentContext

- added a DeferredException to collect errors during startup/shutdown sequences

FragmentExecutor

- eliminated CancelableQuery

- use the FragmentContext's DeferredException for errors

- common subexpression elimination

- cleaned up

QueryContext

- removed unnecessary functions (with some outside classes tweaked for this)

- finals, formatting

QueryManager

- merge in QueryStatus

- affects Foreman, ../batch/ControlHandlerImpl,

and ../../server/rest/ProfileResources

- made some methods private

- removed unused imports

- add finals and formatting

- variable renaming to improve readability

- formatting

- comments

- TODOs

QueryStatus

- getAsInfo() private

- member renaming

- member access changes

- formatting

- TODOs

QueryTestUtil, BaseTestQuery, TestDrillbitResilience

- make maxWidth a parameter to server startup

SelfCleaningRunnable

- created org.apache.drill.common.SelfCleaningRunnable
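The idea of a self-cleaning runnable, as the name suggests, is a task that always runs its own cleanup, whether it finishes normally or fails. The sketch below models that with a wrapper and a `finally` block; the `selfCleaning` factory and the registry map are hypothetical and only illustrate how such a task could remove itself from a map of running work.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of a runnable that cleans up after itself.
// Not the actual org.apache.drill.common.SelfCleaningRunnable.
final class CleanupRunnableSketch {

    // Wraps a task so that cleanup runs even if the task throws.
    static Runnable selfCleaning(Runnable task, Runnable cleanup) {
        return () -> {
            try {
                task.run();
            } finally {
                cleanup.run();
            }
        };
    }

    public static void main(String[] args) {
        Map<String, Runnable> running = new ConcurrentHashMap<>();
        Runnable wrapped = selfCleaning(
                () -> System.out.println("executing fragment"),
                () -> running.remove("frag-1"));
        running.put("frag-1", wrapped);
        wrapped.run();
        System.out.println(running.isEmpty()); // prints true
    }
}
```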

SingleRowListener

- created org.apache.drill.SingleRowListener results listener

- use in TestDrillbitResilience

TestComparisonFunctions

- fix not to close the FragmentContext multiple times

TestDrillbitResilience

- created org.apache.drill.exec.server.TestDrillbitResilience to test drillbit

resilience in the face of exceptions and failures during queries

TestWithZookeeper

- factor out work into ZookeeperHelper so that it can be reused by

TestDrillbitResilience

UserBitShared

- get rid of unused UNKNOWN_QUERY

WorkEventBus

- rename methods, affects Foreman and ControlHandlerImpl

- remove unused WorkerBee reference

- most members final

- formatting

WorkManager

- Closeable to AutoCloseable

- removed unused incomingFragments Set

- eliminated unnecessary eventThread and pendingTasks by posting Runnables

directly to executor

- use SelfCleaningRunnable for Foreman management

- FragmentExecutor management uses SelfCleaningRunnable

- runningFragments to be a ConcurrentHashMap; TestTpchDistributed passes

- other improvements due to bee no longer needed in various places

- most members final

- minor formatting

- comments

- TODOs

(*) Created exception injection classes to simulate exceptions for testing

- ExceptionInjection

- ExceptionInjector

- ExceptionInjectionUtil

- TestExceptionInjection
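Exception injection for testing generally means placing named injection points in production code that throw only when a test has armed them. The sketch below shows that pattern in its simplest form; the `arm` and `injectUnchecked` methods and the site names are hypothetical and are not the API of the ExceptionInjection or ExceptionInjector classes listed above.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of named injection points armed by tests.
final class InjectionSketch {

    // Remaining number of times each armed site should throw.
    private static final Map<String, AtomicInteger> REMAINING = new ConcurrentHashMap<>();

    // Called by a test: make the named site throw the next `times` times.
    static void arm(String site, int times) {
        REMAINING.put(site, new AtomicInteger(times));
    }

    // Called from production code at an injection point; a no-op unless armed.
    static void injectUnchecked(String site) {
        AtomicInteger n = REMAINING.get(site);
        if (n != null && n.getAndUpdate(v -> v > 0 ? v - 1 : 0) > 0) {
            throw new RuntimeException("injected failure at " + site);
        }
    }

    public static void main(String[] args) {
        arm("query-setup", 1);
        try {
            injectUnchecked("query-setup");
        } catch (RuntimeException e) {
            System.out.println(e.getMessage()); // prints "injected failure at query-setup"
        }
        injectUnchecked("query-setup"); // no longer armed, does nothing
    }
}
```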

DRILL-2245-hygiene: General code cleanup encountered while working on the rest

of this commit. This includes

- making members final whenever possible

- making members private whenever possible

- making loggers private

- removing unused imports

- removing unused private functions

- removing unused public functions

- removing unused local variables

- removing unused private members

- deleting unused files

- cleaning up formatting

- adding spaces before braces in conditionals and loop bodies

- breaking up overly long lines

- removing extra blank lines

While I tried to keep this clean, this commit may have minor dependencies on DRILL-2245-core that I missed. The intention is just to break this up for review purposes. Either both commits should be applied, or neither.

  1. … 79 more files in changeset.
DRILL-2294: Prevent collecting intermediate stats before the operator tree has finished being constructed.

DRILL-1378: Ctrl-C to cancel a query that has not returned with the first result set.

DRILL-1690: Issue with using HBase plugin to access row_key only

DRILL-1325: Throw UnsupportedRelOperatorException for unequal joins, implicit cross joins

DRILL-2307: Detect DNS name resolution failure for better error messages

MD-131: Queries fail when using maprdb plugin

DRILL-2280: Refactor ValueVector interface & add an abstract ValueVector implementation

  1. … 10 more files in changeset.
DRILL-2130: Fixed JUnit/Hamcrest/Mockito/Paranamer class path problem.

DRILL-2283: Fixed Java VARCHAR(1) in INFO._SCHEMA that caused bad comparisons.

DRILL-1733 - Include Hadoop winutils in Drill distribution

    • -46
    • +62
    /distribution/src/assemble/bin.xml
    • -0
    • +1
    /distribution/src/resources/sqlline.bat
DRILL-1496: Fix serialization of 'similar to' while converting from optiq expression to drill.

DRILL-2400: Part 1: Change cpu cost estimation formula for DrillFilterRelBase. Add testcase for MergeFilter rule.

Modify costing of Filter.

Cost change to Filter.

Change to filter costing.

Move one test utility method to PlanTestBase.

DRILL-1757 Support for wildcards in repeated_contains()

DRILL-2197: Fix no applicable constructor error in outer join with a map type

    • -0
    • +39
    /exec/java-exec/src/test/resources/join/complex_1.json
    • -0
    • +19
    /exec/java-exec/src/test/resources/join/complex_2.json
DRILL-2279: Raise exception if schema change is encountered in hash and streaming aggregate

DRILL-2270: Broken link on Apache Drill website

DRILL-2269: Add default implementation for estimating cost of evaluating an expression, instead of throwing an exception.

Set default cost of evaluating a HiveFuncHolder expression.

DRILL-2253: Vectorized Parquet reader fails to read correctly against RLE Dictionary encoded DATE column