Clone Tools
  • last updated 22 mins ago
Constraints
Constraints: committers
 
Constraints: files
Constraints: dates
DRILL-7583: Remove STOP status from operator outcome

Now that all operators have been converted to throw

exceptions on error condistions, the STOP status is

unused. This patch removes the STOP status and the

related kill() and killIncoming() methods. The

"kill" methods are replaced by "cancel" methods which

handle "normal" case cancellation, such as for

LIMIT.

closes #1981

    • -6
    • +4
    ./OrderedPartitionProjectorTemplate.java
  1. … 75 more files in changeset.
DRILL-7576: Fail fast for operator errors

Converts operators to fail with a UserException rather than using

the STOP iterator status. The result is clearer error messages

and simpler code.

closes #1975

    • -22
    • +15
    ./OrderedPartitionRecordBatch.java
  1. … 66 more files in changeset.
DRILL-7507: Convert fragment interrupts to exceptions

Modifies fragment interrupt handling to throw a specialized

exception, rather than relying on the complex and cumbersome

STOP iterator status.

closes #1949

  1. … 15 more files in changeset.
DRILL-7506: Simplify code gen error handling

Pushes code gen error handling close to the code gen itself to

allow clearer error messages. Doing so avoids the need to bubble

code gen exceptions up the call stack, resulting in cleaner

operator code.

closes #1948

    • -90
    • +59
    ./OrderedPartitionRecordBatch.java
  1. … 40 more files in changeset.
DRILL-7456: Batch count fixes for 12 operators

Enables batch validation for 12 additional operators:

* MergingRecordBatch

* OrderedPartitionRecordBatch

* RangePartitionRecordBatch

* TraceRecordBatch

* UnionAllRecordBatch

* UnorderedReceiverBatch

* UnpivotMapsRecordBatch

* WindowFrameRecordBatch

* TopNBatch

* HashJoinBatch

* ExternalSortBatch

* WriterRecordBatch

Fixes issues found with those checks so that this set of

operators passes all checks.

Includes code cleanup in many files touched during this

work.

closes #1906

    • -65
    • +68
    ./OrderedPartitionRecordBatch.java
  1. … 46 more files in changeset.
DRILL-7096: Develop vector for canonical Map<K,V>

- Added new type DICT;

- Created value vectors for the type for single and repeated modes;

- Implemented corresponding FieldReaders and FieldWriters;

- Made changes in EvaluationVisitor to be able to read values from the map by key;

- Made changes to DrillParquetGroupConverter to be able to read Parquet's MAP type;

- Added an option `store.parquet.reader.enable_map_support` to disable reading MAP type as DICT from Parquet files;

- Updated AvroRecordReader to use new DICT type for Avro's MAP;

- Added support of the new type to ParquetRecordWriter.

  1. … 108 more files in changeset.
DRILL-7011: Support schema in scan framework

* Adds schema support to the row set-based scan framework and to the "V3" text reader based on that framework.

* Adding the schema made clear that passing options as a long list of constructor arguments was not sustainable. Refactored code to use a builder pattern instead.

* Added support for default values in the "null column loader", which required adding a "setValue" method to the column accessors.

* Added unit tests for all new or changed functionality. See TestCsvWithSchema for the overall test of the entire integrated mechanism.

* Added tests for explicit projection with schema

* Better handling of date/time in column accessors

* Converted recent column metadata work from Java 8 date/time to Joda.

* Added more CSV-with-schema unit tests

* Removed the ID fields from "resolved columns", used "instanceof" instead.

* Added wildcard projection with an output schema. Handles both "lenient" and "strict" schemas.

* Tagged projection columns with their output schema, when available.

* Scan projection added modes for wildcard with an output schema. The reader projection added support for merging reader and output schemas.

* Includes refactoring of scan operator tests (the test file grew too large.)

* Renamed some classes to avoid confusing reader schemas with output schemas.

* Added unit tests for the new functionality.

* Added "lenient" wildcard with schema test for CSV

* Added more type conversions: string-to-bit, many-to-string

* Fixed bug in column writer for VarDecimal

* Added missing unit tests, and fixed bugs, in Bit column reader/writer

* Cleaned up a number of unneded "SuppressWarnings"

closes #1711

  1. … 223 more files in changeset.
DRILL-6724: Dump operator context to logs when error occurs during query execution

closes #1455

  1. … 102 more files in changeset.
DRILL-6422: Replace guava imports with shaded ones

    • -1
    • +1
    ./OrderedPartitionProjectorTemplate.java
  1. … 982 more files in changeset.
DRILL-6435: MappingSet is stateful, so it can't be shared between threads

  1. … 9 more files in changeset.
DRILL-6320: Fixed license headers.

closes #1207

  1. … 2061 more files in changeset.
DRILL-5730: Mock testing improvements and interface improvements

closes #1045

  1. … 222 more files in changeset.
DRILL-5967: Fixed memory leak in OrderedPartitionSender

closes apache/drill#1073

    • -5
    • +30
    ./OrderedPartitionSenderCreator.java
  1. … 1 more file in changeset.
DRILL-5783, DRILL-5841, DRILL-5894: Rationalize test temp directories

This change includes:

DRILL-5783:

- A unit test is created for the priority queue in the TopN operator.

- The code generation classes passed around a completely unused function registry reference in some places so it is removed.

- The priority queue had unused parameters for some of its methods so it is removed.

DRILL-5841:

- Created standardized temp directory classes DirTestWatcher, SubDirTestWatcher, and BaseDirTestWatcher. And updated all unit tests to use them.

DRILL-5894:

- Removed the dfs_test storage plugin for tests and replaced it with the already existing dfs storage plugin.

Misc:

- General code cleanup.

- Removed unnecessary use of String.format in the tests.

This closes #984

  1. … 365 more files in changeset.
DRILL-4264: Allow field names to include dots

  1. … 98 more files in changeset.
DRILL-5116: Enable generated code debugging in each Drill operator

DRILL-5052 added the ability to debug generated code. The reviewer suggested

permitting the technique to be used for all Drill operators. This PR provides

the required fixes. Most were small changes, others dealt with the rather

clever way that the existing byte-code merge converted static nested classes

to non-static inner classes, with the way that constructors were inserted

at the byte-code level and so on. See the JIRA for the details.

This code passed the unit tests twice: once with the traditional byte-code

manipulations, a second time using "plain-old Java" code compilation.

Plain-old Java is turned off by default, but can be turned on for all

operators with a single config change: see the JIRA for info. Consider

the plain-old Java option to be experimental: very handy for debugging,

perhaps not quite tested enough for production use.

close apache/drill#716

    • -18
    • +20
    ./OrderedPartitionProjectorTemplate.java
  1. … 60 more files in changeset.
DRILL-4715: Fix java compilation error in run-time generated code when query has large number of expressions.

Refactor unit test in drillbit context initialization and pass in option manager.

close apache/drill#521

  1. … 53 more files in changeset.
DRILL-4382: Remove dependency on drill-logical from vector package

  1. … 80 more files in changeset.
DRILL-4215: Transfer buffer ownership in TransferPair

  1. … 44 more files in changeset.
DRILL-3987: (REFACTOR) Common and Vector modules building.

- Extract Accountor interface from Implementation

- Separate FMPP modules to separate out Vector Needs versus external needs

- Separate out Vector classes from those that are VectorAccessible.

- Cleanup Memory Exception hiearchy

  1. … 105 more files in changeset.
DRILL-2288: Fix ScanBatch violation of IterOutcome protocol and downstream chain of bugs.

Increments:

2288: Pt. 1 Core: Added unit test. [Drill2288GetColumnsMetadataWhenNoRowsTest, empty.json]

2288: Pt. 1 Core: Changed HBase test table #1's # of regions from 1 to 2. [HBaseTestsSuite]

Also added TODO(DRILL-3954) comment about # of regions.

2288: Pt. 2 Core: Documented IterOutcome much more clearly. [RecordBatch]

Also edited some related Javadoc.

2288: Pt. 2 Hyg.: Edited doc., added @Override, etc. [AbstractRecordBatch, RecordBatch]

Purged unused SetupOutcome.

Added @Override.

Edited comments.

Fix some comments to doc. comments.

2288: Pt. 3 Core&Hyg.: Added validation of IterOutcome sequence. [IteratorValidatorBatchIterator]

Also:

Renamed internal members for clarity.

Added comments.

2288: Pt. 4 Core: Fixed a NONE -> OK_NEW_SCHEMA in ScanBatch.next(). [ScanBatch]

(With nearby comments.)

2288: Pt. 4 Hyg.: Edited comments, reordered, whitespace. [ScanBatch]

Reordered

Added comments.

Aligned.

2288: Pt. 4 Core+: Fixed UnionAllRecordBatch to receive IterOutcome sequence right. (3659) [UnionAllRecordBatch]

2288: Pt. 5 Core: Fixed ScanBatch.Mutator.isNewSchema() to stop spurious "new schema" reports (fix short-circuit OR, to call resetting method right). [ScanBatch]

2288: Pt. 5 Hyg.: Renamed, edited comments, reordered. [ScanBatch, SchemaChangeCallBack, AbstractSingleRecordBatch]

Renamed getSchemaChange -> getSchemaChangedAndReset.

Renamed schemaChange -> schemaChanged.

Added doc. comments.

Aligned.

2288: Pt. 6 Core: Avoided dummy Null.IntVec. column in JsonReader when not needed (MapWriter.isEmptyMap()). [JsonReader, 3 vector files]

2288: Pt. 6 Hyg.: Edited comments, message. Fixed message formatting. [RecordReader, JSONFormatPlugin, JSONRecordReader, AbstractMapVector, JsonReader]

Fixed message formatting.

Edited comments.

Edited message.

Fixed spurious line break.

2288: Pt. 7 Core: Added column families in HBaseRecordReader* to avoid dummy Null.IntVec. clash. [HBaseRecordReader]

2288: Pt. 8 Core.1: Cleared recordCount in OrderedPartitionRecordBatch.innerNext(). [OrderedPartitionRecordBatch]

2288: Pt. 8 Core.2: Cleared recordCount in ProjectRecordBatch.innerNext. [ProjectRecordBatch]

2288: Pt. 8 Core.3: Cleared recordCount in TopNBatch.innerNext. [TopNBatch]

2288: Pt. 9 Core: Had UnorderedReceiverBatch reset RecordBatchLoader's record count. [UnorderedReceiverBatch, RecordBatchLoader]

2288: Pt. 9 Hyg.: Added comments. [RecordBatchLoader]

2288: Pt. 10 Core: Worked around mismatched map child vectors in MapVector.getObject(). [MapVector]

2288: Pt. 11 Core: Added OK_NEW_SCHEMA schema comparison for HashAgg. [HashAggTemplate]

2288: Pt. 12 Core: Fixed memory leak in BaseTestQuery's printing.

Fixed bad skipping of RecordBatchLoader.clear(...) and

QueryDataBatch.load(...) for zero-row batches in printResult(...).

Also, dropped suppression of call to

VectorUtil.showVectorAccessibleContent(...) (so zero-row batches are

as visible as others).

2288: Pt. 13 Core: Fixed test that used unhandled periods in column alias identifiers.

2288: Misc.: Added # of rows to showVectorAccessibleContent's output. [VectorUtil]

2288: Misc.: Added simple/partial toString() [VectorContainer, AbstractRecordReader, JSONRecordReader, BaseValueVector, FieldSelection, AbstractBaseWriter]

2288: Misc. Hyg.: Added doc. comments to VectorContainer. [VectorContainer]

2288: Misc. Hyg.: Edited comment. [DrillStringUtils]

2288: Misc. Hyg.: Clarified message for unhandled identifier containing period.

2288: Pt. 3 Core&Hyg. Upd.: Added schema comparison result to logging. [IteratorValidatorBatchIterator]

2288: Pt. 7 Core Upd.: Handled HBase columns too re NullableIntVectors. [HBaseRecordReader, TestTableGenerator, TestHBaseFilterPushDown]

Created map-child vectors for requested columns.

Added unit test method testDummyColumnsAreAvoided, adding new row to test table,

updated some row counts.

2288: Pt. 7 Hyg. Upd.: Edited comment. [HBaseRecordReader]

2288: Pt. 11 Core Upd.: REVERTED all of bad OK_NEW_SCHEMA schema comparison for HashAgg. [HashAggTemplate]

This reverts commit 0939660f4620c03da97f4e1bf25a27514e6d0b81.

2288: Pt. 6 Core Upd.: Added isEmptyMap override in new (just-rebased-in) PromotableWriter. [PromotableWriter]

Adjusted definition and default implementation of isEmptyMap (to handle MongoDB

storage plugin's use of JsonReader).

2288: Pt. 6 Hyg. Upd.: Purged old atLeastOneWrite flag. [JsonReader]

2288: Pt. 14: Disabled newly dying test testNestedFlatten().

  1. … 39 more files in changeset.
DRILL-3555: Changing defaults for planner.memory.max_query_memory_per_node causes queries with window function to fail

this closes #137

  1. … 4 more files in changeset.
DRILL-1942-hygiene: - add AutoCloseable to many classes - minor fixes - formatting

this closes #133

  1. … 30 more files in changeset.
DRILL-3033: Add memory leak fixes found so far in DRILL-1942 to 1.0.0 Fixes some missing buffer retains() and missing vector clears().

    • -105
    • +121
    ./OrderedPartitionRecordBatch.java
  1. … 20 more files in changeset.
DRILL-2755: Use and handle InterruptedException during query processing.

- Interrupt FragmentExecutor thread as part of FragmentExecutor.cancel()

- Handle InterruptedException in ExternalSortBatch.newSV2(). If the fragment status says

should not continue, then throw the InterruptedException to caller which returns IterOutcome.STOP

- Add comments reg not handling of InterruptedException in SendingAccountor.waitForSendComplete()

- Handle InterruptedException in OrderedPartitionRecordBatch.getPartitionVectors()

If interrupted in Thread.sleep calls and fragment status says should not run, then

return IterOutcome.STOP downstream.

- Interrupt partitioner threads if PartitionerRecordBatch is interrupted while waiting for

partitioner threads to complete.

- Preserve interrupt status if not handled

- Handle null RecordBatches returned by RawBatchBuffer.getNext() in MergingRecordBatch.buildSchema()

- Change timeout in Foreman to be proportional to the number of intermediate fragments sent instead

of hard coded limit of 90s.

- Change TimedRunnable to enforce a timeout of 15s per runnable.

Total timeout is (5s * numOfRunnableTasks) / parallelism.

- Add unit tests

* Testing cancelling a query interrupts the query fragments which are currently blocked

* Testing interrupting the partitioner sender which in turn interrupts its helper threads

* Testing TimedRunanble enforeces timeout for the whole task list.

  1. … 28 more files in changeset.
DRILL-2826: Simplify and centralize Operator Cleanup

- Remove cleanup method from RecordBatch interface

- Make OperatorContext creation and closing the management of FragmentContext

- Make OperatorContext an abstract class and the impl only available to FragmentContext

- Make RecordBatch closing the responsibility of the RootExec

- Make all closes be suppresing closes to maximize memory release in failure

- Add new CloseableRecordBatch interface used by RootExec

- Make RootExec AutoCloseable

- Update RecordBatchCreator to return CloseableRecordBatches so that RootExec can maintain list

- Generate list of operators through change in ImplCreator

  1. … 95 more files in changeset.
DRILL-2245: Clean up query setup and execution kickoff in Foreman/WorkManager in order to ensure consistent handling, and avoid hangs and races, with the goal of improving Drillbit robustness.

I did my best to keep these clean when I split them up, but this core commit

may depend on some minor changes in the hygiene commit that is also

associated with this bug, so either both should be applied, or neither.

The core commit should be applied first.

protocol/pom.xml

- updated protocol buffer compiler version to 2.6

- this made slight modifications to the formats of a few committed protobuf

files

AutoCloseables

- created org.apache.drill.common.AutoCloseables to handle closing these

quietly

BaseTestQuery, and derivatives

- factored out pieces into QueryTestUtil so they can be reused

DeferredException:

- created this so we can collect exceptions during the shutdown process

Drillbit

- uses AutoCloseables for the WorkManager and for the storeProvider

- allow start() to take a RemoteServiceSet

- private, final, formatting

Foreman

- added new state CANCELLATION_REQUESTED (via UserBitShared.proto) to represent

the time between request of a cancellation, and acknowledgement from all

remote endpoints running fragments on a query's behalf

- created ForemanResult to manage interleaving cleanup effects/failure with

query result state

- does not need to implement Comparable

- does not need to implement Closeable

- thread blocking fixes

- add resultSent flag

- add code to log plan fragments with endpoint assignments

- added finals, cleaned up formatting

- do queue management in acquireQuerySemaphore; local tests pass

- rename getContext() to getQueryContext()

- retain DrillbitContext

- a couple of exception injections for testing

- minor formatting

- TODOs

FragmentContext

- added a DeferredException to collect errors during startup/shutdown sequences

FragmentExecutor

- eliminated CancelableQuery

- use the FragmentContext's DeferredException for errors

- common subexpression elimination

- cleaned up

QueryContext

- removed unnecessary functions (with some outside classes tweaked for this)

- finals, formatting

QueryManager

- merge in QueryStatus

- affects Foreman, ../batch/ControlHandlerImpl,

and ../../server/rest/ProfileResources

- made some methods private

- removed unused imports

- add finals and formatting

- variable renaming to improve readability

- formatting

- comments

- TODOs

QueryStatus

- getAsInfo() private

- member renaming

- member access changes

- formatting

- TODOs

QueryTestUtil, BaseTestQuery, TestDrillbitResilience

- make maxWidth a parameter to server startup

SelfCleaningRunnable

- created org.apache.drill.common.SelfCleaningRunnable

SingleRowListener

- created org.apache.drill.SingleRowListener results listener

- use in TestDrillbitResilience

TestComparisonFunctions

- fix not to close the FragmentContext multiple times

TestDrillbitResilience

- created org.apache.drill.exec.server.TestDrillbitResilience to test drillbit

resilience in the face of exceptions and failures during queries

TestWithZookeeper

- factor out work into ZookeeperHelper so that it can be reused by

TestDrillbitResilience

UserBitShared

- get rid of unused UNKNOWN_QUERY

WorkEventBus

- rename methods, affects Foreman and ControlHandlerImpl

- remove unused WorkerBee reference

- most members final

- formatting

WorkManager

- Closeable to AutoCloseable

- removed unused incomingFragments Set

- eliminated unnecessary eventThread and pendingTasks by posting Runnables

directly to executor

- use SelfCleaningRunnable for Foreman management

- FragmentExecutor management uses SelfCleaningRunnable

- runningFragments to be a ConcurrentHashMap; TestTpchDistributed passes

- other improvements due to bee no longer needed in various places

- most members final

- minor formatting

- comments

- TODOs

(*) Created exception injection classes to simulate exceptions for testing

- ExceptionInjection

- ExceptionInjector

- ExceptionInjectionUtil

- TestExceptionInjection

DRILL-2245-hygiene: General code cleanup encountered while working on the rest

of this commit. This includes

- making members final whenever possible

- making members private whenever possible

- making loggers private

- removing unused imports

- removing unused private functions

- removing unused public functions

- removing unused local variables

- removing unused private members

- deleting unused files

- cleaning up formatting

- adding spaces before braces in conditionals and loop bodies

- breaking up overly long lines

- removing extra blank lines

While I tried to keep this clean, this commit may have minor dependencies on

DRILL-2245-core that I missed. The intention is just to break this up for

review purposes. Either both commits should be applied, or neither.

  1. … 93 more files in changeset.
DRILL-1062: Implemented null ordering (NULLS FIRST/NULLS LAST).

Primary:

- Split "compare_to" function templates (for sorting) into

"compare_to_nulls_high" and "compare_to_nulls_low" versions.

- Added tests to verify ORDER BY ordering.

- Added tests to verify merge join order correctness.

- Implemented java.sql.DatabaseMetaData.nullsAreSortedHigh(), etc.

Secondary:

- Eliminated DateInterfaceFunctions.java template (merged into other).

- Renamed comparison-related template data objects and file names.

- Eliminated unused template macros, function template classes.

- Overhauled Order.Ordering; added unit test.

- Regularized some generated-class names.

Miscellaneous:

- Added toString() to ExpressionPosition, Order.Ordering, JoinStatus.

- Fixed some typos.

- Fixed some comment syntax.

  1. … 50 more files in changeset.
DRILL-1960: Automatic reallocation

    • -3
    • +1
    ./OrderedPartitionProjectorTemplate.java
  1. … 64 more files in changeset.
DRILL-1436: Remove use of UDP based cache for purposes of intermediate PlanFragment distribution

Includes:

- Remove dependency on Infinispan

- Update initialize fragments to send in batches.

- Update RPC layer to capture UserRpcExceptions and propagate back.

- Send full stack trace in DrillPBError and let foreman node decide on formatting.

- Increment control rpc version

- Update systables to report current drillbit and version

  1. … 65 more files in changeset.