DRILL-2245: Clean up query setup and execution kickoff in Foreman/WorkManager in order to ensure consistent handling, and avoid hangs and races, with the goal of improving Drillbit robustness.

I did my best to keep these clean when I split them up, but this core commit may depend on some minor changes in the hygiene commit that is also associated with this bug, so either both should be applied, or neither. The core commit should be applied first.

protocol/pom.xml

- updated protocol buffer compiler version to 2.6

- this made slight modifications to the formats of a few committed protobuf files

AutoCloseables

- created org.apache.drill.common.AutoCloseables to handle closing these quietly
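
As an illustration of what "closing quietly" can mean, here is a minimal sketch. The class and method names are hypothetical and this is not the actual org.apache.drill.common.AutoCloseables source; it only assumes that a quiet close swallows the exception and hands it back to the caller instead of throwing.

```java
// Hypothetical helper for illustration; not the Drill AutoCloseables class.
final class QuietCloser {
  private QuietCloser() {}

  /**
   * Closes the resource if it is non-null. Any exception raised by close()
   * is returned to the caller rather than thrown; null means success.
   */
  static Exception closeQuietly(AutoCloseable resource) {
    if (resource == null) {
      return null;
    }
    try {
      resource.close();
      return null;
    } catch (Exception e) {
      return e;
    }
  }
}
```

Returning the exception rather than discarding it lets a shutdown path log or collect the failure while still closing the remaining resources.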

BaseTestQuery, and derivatives

- factored out pieces into QueryTestUtil so they can be reused

DeferredException:

- created this so we can collect exceptions during the shutdown process
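
The idea of collecting exceptions during shutdown might look roughly like the sketch below. It is an assumption-laden illustration, not the real DeferredException: the class name, method names, and the choice to attach later failures as suppressed exceptions are all hypothetical.

```java
// Hypothetical sketch; the real DeferredException may behave differently.
final class DeferredErrors {
  private Exception first;

  /** Records a failure. Later failures are attached to the first as suppressed. */
  synchronized void add(Exception e) {
    if (e == null) {
      return;
    }
    if (first == null) {
      first = e;
    } else if (first != e) {
      // addSuppressed rejects self-suppression, hence the identity check above.
      first.addSuppressed(e);
    }
  }

  /** Rethrows the collected failure, if any, once all cleanup steps have run. */
  synchronized void throwIfAny() throws Exception {
    if (first != null) {
      Exception e = first;
      first = null;
      throw e;
    }
  }
}
```

A shutdown sequence would call add() after each failed cleanup step and throwIfAny() once at the end, so one failing step does not prevent the others from running.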

Drillbit

- uses AutoCloseables for the WorkManager and for the storeProvider

- allow start() to take a RemoteServiceSet

- private, final, formatting

Foreman

- added new state CANCELLATION_REQUESTED (via UserBitShared.proto) to represent the time between request of a cancellation, and acknowledgement from all remote endpoints running fragments on a query's behalf

- created ForemanResult to manage interleaving cleanup effects/failure with query result state

- does not need to implement Comparable

- does not need to implement Closeable

- thread blocking fixes

- add resultSent flag

- add code to log plan fragments with endpoint assignments

- added finals, cleaned up formatting

- do queue management in acquireQuerySemaphore; local tests pass

- rename getContext() to getQueryContext()

- retain DrillbitContext

- a couple of exception injections for testing

- minor formatting

- TODOs
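
The role of the CANCELLATION_REQUESTED state can be sketched as a small state tracker. Only CANCELLATION_REQUESTED is named in the commit message; the other states, the class name, and the method names below are assumptions made for illustration and do not reflect the Foreman code.

```java
// Hypothetical sketch; only CANCELLATION_REQUESTED comes from the commit message.
final class CancelTracker {
  enum State { RUNNING, CANCELLATION_REQUESTED, CANCELED }

  private State state = State.RUNNING;
  private int pendingAcks;

  CancelTracker(int remoteEndpoints) {
    this.pendingAcks = remoteEndpoints;
  }

  /** Moves to the intermediate state until every remote endpoint has acknowledged. */
  synchronized void requestCancel() {
    if (state == State.RUNNING) {
      state = (pendingAcks == 0) ? State.CANCELED : State.CANCELLATION_REQUESTED;
    }
  }

  /** Called once per remote endpoint that acknowledges the cancellation. */
  synchronized void endpointAcknowledged() {
    if (state != State.CANCELLATION_REQUESTED) {
      return;
    }
    pendingAcks--;
    if (pendingAcks == 0) {
      state = State.CANCELED;
    }
  }

  synchronized State state() {
    return state;
  }
}
```

The intermediate state makes the waiting period explicit, so a query is not reported as canceled while fragments may still be running remotely.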

FragmentContext

- added a DeferredException to collect errors during startup/shutdown sequences

FragmentExecutor

- eliminated CancelableQuery

- use the FragmentContext's DeferredException for errors

- common subexpression elimination

- cleaned up

QueryContext

- removed unnecessary functions (with some outside classes tweaked for this)

- finals, formatting

QueryManager

- merge in QueryStatus

- affects Foreman, ../batch/ControlHandlerImpl, and ../../server/rest/ProfileResources

- made some methods private

- removed unused imports

- add finals and formatting

- variable renaming to improve readability

- formatting

- comments

- TODOs

QueryStatus

- getAsInfo() private

- member renaming

- member access changes

- formatting

- TODOs

QueryTestUtil, BaseTestQuery, TestDrillbitResilience

- make maxWidth a parameter to server startup

SelfCleaningRunnable

- created org.apache.drill.common.SelfCleaningRunnable
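
A runnable that cleans up after itself could be sketched as below. This is a hypothetical illustration, not the real org.apache.drill.common.SelfCleaningRunnable; it assumes the wrapper runs a delegate and always invokes a cleanup hook afterwards, even when the delegate throws.

```java
// Hypothetical sketch; the real SelfCleaningRunnable may differ.
abstract class CleanupRunnable implements Runnable {
  private final Runnable delegate;

  CleanupRunnable(Runnable delegate) {
    this.delegate = delegate;
  }

  @Override
  public final void run() {
    try {
      delegate.run();
    } finally {
      // Runs on both normal completion and failure.
      cleanup();
    }
  }

  /** Hook for releasing bookkeeping, e.g. removing the task from a map of running work. */
  protected abstract void cleanup();
}
```

A manager that tracks running tasks in a concurrent map can remove the entry inside cleanup(), which avoids a separate thread that polls for finished work.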

SingleRowListener

- created org.apache.drill.SingleRowListener results listener

- use in TestDrillbitResilience

TestComparisonFunctions

- fix not to close the FragmentContext multiple times

TestDrillbitResilience

- created org.apache.drill.exec.server.TestDrillbitResilience to test drillbit resilience in the face of exceptions and failures during queries

TestWithZookeeper

- factor out work into ZookeeperHelper so that it can be reused by TestDrillbitResilience

UserBitShared

- get rid of unused UNKNOWN_QUERY

WorkEventBus

- rename methods, affects Foreman and ControlHandlerImpl

- remove unused WorkerBee reference

- most members final

- formatting

WorkManager

- Closeable to AutoCloseable

- removed unused incomingFragments Set

- eliminated unnecessary eventThread and pendingTasks by posting Runnables directly to executor

- use SelfCleaningRunnable for Foreman management

- FragmentExecutor management uses SelfCleaningRunnable

- runningFragments to be a ConcurrentHashMap; TestTpchDistributed passes

- other improvements due to bee no longer needed in various places

- most members final

- minor formatting

- comments

- TODOs

(*) Created exception injection classes to simulate exceptions for testing

- ExceptionInjection

- ExceptionInjector

- ExceptionInjectionUtil

- TestExceptionInjection
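
Exception injection for testing can be illustrated with a very small sketch. The class below is hypothetical and is not the ExceptionInjection/ExceptionInjector API listed above; it only assumes that tests arm a named site and that production code calls a hook at that site.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch; names and behavior are assumptions, not the Drill classes.
final class FaultInjector {
  private final Map<String, RuntimeException> armed = new ConcurrentHashMap<>();

  /** Test code arms a named site with the exception to simulate. */
  void arm(String site, RuntimeException toThrow) {
    armed.put(site, toThrow);
  }

  /** Code under test calls this at a named site; it throws once if the site is armed. */
  void injectAt(String site) {
    RuntimeException e = armed.remove(site);
    if (e != null) {
      throw e;
    }
  }
}
```

Because an armed site fires only once, a resilience test can check both the failure path and the subsequent recovery with the same injector.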

DRILL-2245-hygiene: General code cleanup encountered while working on the rest of this commit. This includes

- making members final whenever possible

- making members private whenever possible

- making loggers private

- removing unused imports

- removing unused private functions

- removing unused public functions

- removing unused local variables

- removing unused private members

- deleting unused files

- cleaning up formatting

- adding spaces before braces in conditionals and loop bodies

- breaking up overly long lines

- removing extra blank lines

While I tried to keep this clean, this commit may have minor dependencies on DRILL-2245-core that I missed. The intention is just to break this up for review purposes. Either both commits should be applied, or neither.

  1. … 93 more files in changeset.
DRILL-1062: Implemented null ordering (NULLS FIRST/NULLS LAST).

Primary:

- Split "compare_to" function templates (for sorting) into "compare_to_nulls_high" and "compare_to_nulls_low" versions.

- Added tests to verify ORDER BY ordering.

- Added tests to verify merge join order correctness.

- Implemented java.sql.DatabaseMetaData.nullsAreSortedHigh(), etc.
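
The split into "nulls high" and "nulls low" comparisons can be sketched in plain Java as follows. The method names echo the template names above, but the bodies are an assumption for illustration and are not the generated Drill code; "nulls high" is taken here to mean that null compares greater than every non-null value.

```java
// Hypothetical sketch; not the Drill function templates.
final class NullOrdering {
  private NullOrdering() {}

  /** Null sorts after all non-null values in ascending order. */
  static int compareToNullsHigh(Integer left, Integer right) {
    if (left == null) {
      return (right == null) ? 0 : 1;
    }
    if (right == null) {
      return -1;
    }
    return Integer.compare(left, right);
  }

  /** Null sorts before all non-null values in ascending order. */
  static int compareToNullsLow(Integer left, Integer right) {
    if (left == null) {
      return (right == null) ? 0 : -1;
    }
    if (right == null) {
      return 1;
    }
    return Integer.compare(left, right);
  }
}
```

Under this assumption, an ascending sort with the "high" variant places nulls last, and the "low" variant places them first; a descending sort reverses both.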

Secondary:

- Eliminated DateInterfaceFunctions.java template (merged into other).

- Renamed comparison-related template data objects and file names.

- Eliminated unused template macros, function template classes.

- Overhauled Order.Ordering; added unit test.

- Regularized some generated-class names.

Miscellaneous:

- Added toString() to ExpressionPosition, Order.Ordering, JoinStatus.

- Fixed some typos.

- Fixed some comment syntax.

  1. … 50 more files in changeset.
DRILL-1960: Automatic reallocation

    • -3
    • +1
    ./OrderedPartitionProjectorTemplate.java
  1. … 64 more files in changeset.
DRILL-1436: Remove use of UDP based cache for purposes of intermediate PlanFragment distribution

Includes:

- Remove dependency on Infinispan

- Update initialize fragments to send in batches.

- Update RPC layer to capture UserRpcExceptions and propagate back.

- Send full stack trace in DrillPBError and let foreman node decide on formatting.

- Increment control rpc version

- Update systables to report current drillbit and version

  1. … 65 more files in changeset.
DRILL-1384: Part 1 - Rebase on Calcite. Change code due to Calcite package renaming/re-structure.

Optiq changed to use DATETIME_PLUS. Have to handle it in Drill.

PushFilterPastJoinRule has some issue. Temp fix for that.

Failed unit tests:

1) TestFlatten

2) TestConvertFunctions / TestComplexTypeWriter : "Concat"

3) TPCH Q16 : CanNotPlanException

Feed a RelDataTypeSystem into planner, to support decimal with precision/scale up to 38.

Remove assertion in DrillFilterRel. Optiq/Calcite could create a TRUE AND TRUE for query like WHERE col1 in (select ...) and col2 in (select ...) .

Rebase on calcite-1.1.0-drill-test-r1. Change code due to Calcite package renaming/re-structure.

Rebase on calcite : renaming with perl script. Part 1

reverse change to jdbc test.

Renaming for rebasing calcite. Part 2

Renaming for calcite rebasing. Part 3

Renaming for calcite rebasing. Part 4

Reverse change to testcase in jdbc.

Renaming for calcite rebasing. Part 5

Renaming for calcite rebasing. Part 6

remove 1.sh

WindowRel change related.

Renaming for calcite rebase. Part 7

PreprocessLogical and AggPrelBase

Renaming for calcite rebasing. Part 8. More manual change

Rebasing Calcite. Part 9

Rebasing calcite. Part 10

Rebasing API change from Calcite.

SQL parser change, due to Calcite rebasing.

Renaming change for calcite rebasing.

Renaming package due to Calcite rebasing.

Renaming package due to Calcite Rebase.

Work in progress for calcite rebasing.

Change import package names due to Calcite rebase.

Code refactor due to Calcite rebasing.

Fix bug in DistributionTraitDef.

Resolve compiler error, due to Calcite Rebasing.

Resolve compiler error after Calcite Rebasing.

minor change.

  1. … 261 more files in changeset.
DRILL-1402: Add check-style rules for trailing space, TABs and blocks without braces

  1. … 441 more files in changeset.
DRILL-634: Cleanup/organize Java imports and trailing whitespaces from Drill code

    • -3
    • +5
    ./OrderedPartitionProjectorTemplate.java
  1. … 765 more files in changeset.
DRILL-991: Limit should terminate upstream fragments immediately upon completion

  1. … 36 more files in changeset.
Enable View persistence, Storage Plugin and System option persistence.

Conflicts:

exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java

exec/java-exec/src/test/java/org/apache/drill/exec/cache/TestCacheSerialization.java

  1. … 167 more files in changeset.
DRILL-865: Interface changes to AbstractRecordBatch to enable easy collection of stats. Add BaseRootExec as a wrapper to collect stats for senders.

    • -89
    • +84
    ./OrderedPartitionRecordBatch.java
  1. … 17 more files in changeset.
status changes

    • -84
    • +89
    ./OrderedPartitionRecordBatch.java
  1. … 76 more files in changeset.
Add support for RepeatedMapVector, MapVector and RepeatedListVector.

    • -2
    • +2
    ./OrderedPartitionProjectorTemplate.java
  1. … 135 more files in changeset.
Switch distributed cache to Infinispan. Add operator-level metrics.

    • -10
    • +10
    ./OrderedPartitionRecordBatch.java
  1. … 109 more files in changeset.
Move to Optiq 0.6

Also includes:

- improve exception catching

- move schema path parsing to Antlr

- close zookeeper connection if client created

- enhance BaseTestQuery and have other query tests utilize it

- Various test fixes for better memory release; still needs client allocator to be closed

- refactor DrillSqlWorker and create multiple SqlHandlers

- Add PojoRecordReader and DirectPlan capabilities

- Update Antlr to use same quoting rules as SQL: single quote for quoted strings, back ticks for identifiers

- Move back to old Sorts until bugs are fixed

- Refactor SelectionVector management within Prels

- Add support for NO_EXCHANGES option

- Extract SchemaFactories to use Optiq's new Schema handling capabilities

- Add basic handling of cancel in UserServer

- Remove output requirement from Project

- Add start of usercredentials to User communication

    • -21
    • +21
    ./OrderedPartitionRecordBatch.java
  1. … 181 more files in changeset.
DRILL-620: Memory consumption fixes

accounting fixes

trim buffers

switch to using setSafe and copySafe methods only

adaptive allocation

operator based allocator wip

handle OOM

Operator Context

    • -1
    • +3
    ./OrderedPartitionProjectorTemplate.java
    • -23
    • +52
    ./OrderedPartitionRecordBatch.java
  1. … 155 more files in changeset.
DRILL-335: Implement Hash Aggregation

1. Implementation of the hash aggregation execution operator - this has two main parts: the HashAggTemplate and the HashAggBatch.

2. Implementation of a hash table which is used by the hash aggregation. The hash table has two main parts: the HashTableTemplate and the ChainedHashTable. The hash table internally uses the notion of 'BatchHolder' to keep track of all keys that can fit within one batch of 64K values. New BatchHolder objects are created as needed. Each BatchHolder has its own vector container. The HashAggregate also has a similar structure and it keeps track of the workspace variables.

(NOTE: An initial design document for the hash aggregation and hash table was already attached with Drill-335. The document has not yet been updated with the latest implementation ... will try to do that in the near future).

3. Jinfeng's changes to use workspace vectors in the generated code for aggregate functions (previously, for streaming aggregate we only needed to maintain workspace variable for 1 running group; however for hash aggregate we need to maintain it for all groups).

4. Fix for Drill-318: because of #3 above, the previous fix for Drill-318 is not valid anymore. I modified the template generation code for the aggregate functions such that they conform to the new infrastructure.

5. The original AggTemplate, AggBatch and Aggregator classes have been moved to corresponding StreamingAggTemplate, StreamingAggBatch and StreamingAggregator in order to differentiate it from hash aggregation. These appear as new files but the code there has not changed.
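
The 'BatchHolder' notion from item 2 can be illustrated with a small sketch that stores keys in fixed-size batches of 64K values and creates new batches as needed. The class below is hypothetical: splitting a global index into a batch number and an offset is an assumption made for illustration, not something taken from HashTableTemplate.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; the index-splitting scheme is an assumption.
final class BatchedKeys {
  static final int BATCH_BITS = 16;
  static final int BATCH_SIZE = 1 << BATCH_BITS; // 64K values per holder
  static final int OFFSET_MASK = BATCH_SIZE - 1;

  private final List<long[]> holders = new ArrayList<>();
  private int count;

  /** Appends a key, allocating a new holder when the current one is full. */
  int append(long key) {
    int batch = count >>> BATCH_BITS;
    if (batch == holders.size()) {
      holders.add(new long[BATCH_SIZE]);
    }
    holders.get(batch)[count & OFFSET_MASK] = key;
    return count++;
  }

  /** Looks up a key by its global index. */
  long get(int index) {
    return holders.get(index >>> BATCH_BITS)[index & OFFSET_MASK];
  }

  int holderCount() {
    return holders.size();
  }
}
```

Fixed-size holders keep each allocation bounded and let the structure grow without copying existing keys.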

I have run several tests manually as part of TestHashAggr. These tests use TPC-H data and in particular a relatively large 'Orders' table. However, I have not yet packaged the tests to run as part of JUnit since the location and size of the parquet files need to be figured out. I will continue to work on that.

  1. … 48 more files in changeset.
DRILL-387: Support using Hive simple UDFs in Drill

- As part of this change, FunctionDefinition (and related code) is deleted; the same information available in Function Holders is used instead whenever required

- Freemarker/CodeModel codegen for Drill ObjectInspectors

- Comparator function cleanup

  1. … 85 more files in changeset.
DRILL-346: Move constant expressions to setup.

  1. … 18 more files in changeset.
DRILL-450: Add exchange rules, move from BasicOptimizer to Optiq

  1. … 126 more files in changeset.
DRILL-257: Move SQL parsing to server side. Switch to Avatica based JDBC driver.

- Update QuerySubmitter to support SQL queries.

- Update SqlAccesors to support getObject().

- Remove ref, clean up SQL packages some.

- Various performance fixes.

- Update result set so the first set of results must be returned before control is returned to the client, to allow metadata to populate for aggressive tools like sqlline.

- Move timeout functionality to TestTools.

- Update Expression materializer so that it will return a nullable int if a field is not found.

- Update Project record batch to support simple wildcard queries.

- Update JSON record reader test to expect VarCharVector.getObject to return a String rather than a byte[].

    • -11
    • +12
    ./OrderedPartitionRecordBatch.java
  1. … 310 more files in changeset.
DRILL-352: Extract ClassGenerator from CodeGenerator. Use hierarchical set of ClassGenerators to support implementation of an arbitrarily complex inner class hierarchy. Introduce @RuntimeOveridden annotation to allow overrides of inner classes without requiring abstract inner classes. Update operators to use methods. Add test for inner class generation. Remove support for JDK based compilation, focusing entirely on Janino based compilation.

    • -25
    • +51
    ./OrderedPartitionRecordBatch.java
  1. … 34 more files in changeset.
DRILL-334: Subdivide Drillbit control and data messages. Add support for socket backpressure. Add TopLevel and Child memory allocator with debug mode to capture memory leaks. Various memory leak fixes to get build to complete.

Also includes fixes from reviews by Tim.

    • -180
    • +249
    ./OrderedPartitionRecordBatch.java
  1. … 212 more files in changeset.
DRILL-259: Implicit cast functionality

Includes following changes:

Implicit cast : WIP.

Implicit cast: work in progress 2.

implicit_cast: work in progress 3.

implicit cast : work in progress. Prototype works.

Implicit cast: add unit testcase. Handle null vs non-null type.

Implicit cast: add unit test. change comparison operator's allsame from true to false.

Drill-259: implicit cast. Rebase on Drill-316 explicit cast branch.

DRILL-259: implicit cast . reverse change to codegenerator.

DRILL-259: implicit cast. code clean.

DRILL-259: code clean.

Drill-259: add apache license to 7 new java files. reverse the change to pom.xml

Drill-259: minor change to test case.

Drill-259: implicit cast - process NullExpression. Convert NullExpression into TypedNullConstant.

Drill-259: implicit cast. Revise according to review comments.

  1. … 33 more files in changeset.
DRILL-311: Replace OrderedPartitionBatchCreator with OrderedPartitionSenderCreator

    • -39
    • +0
    ./OrderedPartitionBatchCreator.java
    • -0
    • +47
    ./OrderedPartitionSenderCreator.java
  1. … 2 more files in changeset.
DRILL-271 Address code review comments. VectorAccessibleSerializable now takes WritableBatch. Release and reconstruct code now part of WritableBatch class.

  1. … 5 more files in changeset.
DRILL-271 refactor DistributedCache code. Uses Hazelcast 3.1 and custom serialization.

    • -10
    • +10
    ./OrderedPartitionRecordBatch.java
  1. … 12 more files in changeset.
DRILL-271 retool VectorContainerSerializable to work with containers and batches. also modify DrillSerializable to work with InputStream, OutputStream

  1. … 4 more files in changeset.
DRILL-254: Add iterator validator and correct interface violations

  1. … 11 more files in changeset.
DRILL-230: Addressing comments in code review, abstract out references to HazelCache and add comments

    • -1
    • +2
    ./OrderedPartitionProjectorTemplate.java
    • -94
    • +148
    ./OrderedPartitionRecordBatch.java
  1. … 32 more files in changeset.
DRILL-230: Build a sampling range partitioner

    • -0
    • +39
    ./OrderedPartitionBatchCreator.java
    • -0
    • +40
    ./OrderedPartitionProjector.java
    • -0
    • +101
    ./OrderedPartitionProjectorTemplate.java
    • -0
    • +403
    ./OrderedPartitionRecordBatch.java
    • -0
    • +63
    ./SampleCopierTemplate.java
    • -0
    • +61
    ./SampleSortTemplate.java
    • -0
    • +101
    ./SortContainerBuilder.java
  1. … 37 more files in changeset.
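
For readers unfamiliar with the term, a sampling range partitioner can be sketched generically as follows: sort a sample of keys, pick evenly spaced split points, and route each key to the range it falls in. This is a textbook-style illustration with hypothetical names; it is not the logic in OrderedPartitionRecordBatch.

```java
import java.util.Arrays;

// Hypothetical sketch of sample-based range partitioning; not the Drill implementation.
final class RangePartitioner {
  private final int[] splitPoints;

  /** Builds (partitions - 1) split points from a non-empty sample of keys. */
  RangePartitioner(int[] sample, int partitions) {
    int[] sorted = sample.clone();
    Arrays.sort(sorted);
    splitPoints = new int[partitions - 1];
    for (int i = 1; i < partitions; i++) {
      splitPoints[i - 1] = sorted[i * sorted.length / partitions];
    }
  }

  /** Returns the partition index; keys equal to a split point go to the higher range. */
  int partitionFor(int key) {
    int pos = Arrays.binarySearch(splitPoints, key);
    // A negative result encodes the insertion point as -(insertionPoint) - 1.
    return (pos >= 0) ? pos + 1 : -(pos + 1);
  }
}
```

With the sample {10, 20, 30, 40, 50, 60} and three partitions, the split points are 30 and 50, so keys below 30 go to partition 0, keys from 30 to 49 go to partition 1, and keys of 50 or more go to partition 2.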