DRILL-7507: Convert fragment interrupts to exceptions

Modifies fragment interrupt handling to throw a specialized

exception, rather than relying on the complex and cumbersome

STOP iterator status.

closes #1949
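
The shift from a STOP status to an exception can be illustrated with a minimal sketch. All names here (`FragmentCancelledSketch`, `CancelFlag`, `checkContinue`) are hypothetical and are not taken from the Drill code base; the sketch only shows why an exception removes the need for every operator to check and forward a status value.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical exception type; not Drill's actual class.
class FragmentCancelledSketch extends RuntimeException {
  FragmentCancelledSketch(String message) {
    super(message);
  }
}

class InterruptSketch {
  // Hypothetical stand-in for "should this fragment keep running?"
  interface CancelFlag {
    boolean shouldContinue();
  }

  // Throws instead of returning a STOP-like status, so callers
  // higher in the stack need no special-case branch.
  static void checkContinue(CancelFlag flag) {
    if (!flag.shouldContinue()) {
      throw new FragmentCancelledSketch("fragment cancelled");
    }
  }

  // A toy "operator" that sums rows and checks for cancellation per row.
  static int sum(List<Integer> rows, CancelFlag flag) {
    int total = 0;
    for (int row : rows) {
      checkContinue(flag);
      total += row;
    }
    return total;
  }

  public static void main(String[] args) {
    System.out.println(sum(Arrays.asList(1, 2, 3), () -> true));
    try {
      sum(Arrays.asList(1, 2, 3), () -> false);
    } catch (FragmentCancelledSketch e) {
      System.out.println("cancelled: " + e.getMessage());
    }
  }
}
```

Only the outermost caller catches the exception; intermediate code simply lets it propagate.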

  1. … 15 more files in changeset.
DRILL-7506: Simplify code gen error handling

Pushes code gen error handling close to the code gen itself to

allow clearer error messages. Doing so avoids the need to bubble

code gen exceptions up the call stack, resulting in cleaner

operator code.

closes #1948

  1. … 40 more files in changeset.
DRILL-7487: Removes the unused OUT_OF_MEMORY iterator status

See JIRA ticket for full explanation.

closes #1930

  1. … 42 more files in changeset.
DRILL-7456: Batch count fixes for 12 operators

Enables batch validation for 12 additional operators:

* MergingRecordBatch

* OrderedPartitionRecordBatch

* RangePartitionRecordBatch

* TraceRecordBatch

* UnionAllRecordBatch

* UnorderedReceiverBatch

* UnpivotMapsRecordBatch

* WindowFrameRecordBatch

* TopNBatch

* HashJoinBatch

* ExternalSortBatch

* WriterRecordBatch

Fixes issues found with those checks so that this set of

operators passes all checks.

Includes code cleanup in many files touched during this

work.

closes #1906

  1. … 46 more files in changeset.
DRILL-7011: Support schema in scan framework

* Adds schema support to the row set-based scan framework and to the "V3" text reader based on that framework.

* Adding the schema made clear that passing options as a long list of constructor arguments was not sustainable. Refactored code to use a builder pattern instead.

* Added support for default values in the "null column loader", which required adding a "setValue" method to the column accessors.

* Added unit tests for all new or changed functionality. See TestCsvWithSchema for the overall test of the entire integrated mechanism.

* Added tests for explicit projection with schema

* Better handling of date/time in column accessors

* Converted recent column metadata work from Java 8 date/time to Joda.

* Added more CSV-with-schema unit tests

* Removed the ID fields from "resolved columns", used "instanceof" instead.

* Added wildcard projection with an output schema. Handles both "lenient" and "strict" schemas.

* Tagged projection columns with their output schema, when available.

* Scan projection added modes for wildcard with an output schema. The reader projection added support for merging reader and output schemas.

* Includes refactoring of scan operator tests (the test file grew too large).

* Renamed some classes to avoid confusing reader schemas with output schemas.

* Added unit tests for the new functionality.

* Added "lenient" wildcard with schema test for CSV

* Added more type conversions: string-to-bit, many-to-string

* Fixed bug in column writer for VarDecimal

* Added missing unit tests, and fixed bugs, in Bit column reader/writer

* Cleaned up a number of unneeded "SuppressWarnings"

closes #1711
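
The move from long constructor argument lists to a builder, mentioned above, might look roughly like the following. This is a sketch under assumptions: `ScanOptionsSketch` and its three options are hypothetical and do not reflect the actual scan framework classes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical options holder; not the scan framework's real class.
final class ScanOptionsSketch {
  private final List<String> projection;
  private final String nullType;
  private final boolean strictSchema;

  private ScanOptionsSketch(Builder b) {
    // Copy so later changes to the builder do not leak into this object.
    this.projection = Collections.unmodifiableList(new ArrayList<>(b.projection));
    this.nullType = b.nullType;
    this.strictSchema = b.strictSchema;
  }

  List<String> projection() { return projection; }
  String nullType() { return nullType; }
  boolean strictSchema() { return strictSchema; }

  static Builder builder() { return new Builder(); }

  static final class Builder {
    private final List<String> projection = new ArrayList<>();
    private String nullType = "INT";      // assumed default, for illustration
    private boolean strictSchema = false; // assumed default, for illustration

    Builder project(String column) { projection.add(column); return this; }
    Builder nullType(String type) { this.nullType = type; return this; }
    Builder strictSchema(boolean strict) { this.strictSchema = strict; return this; }

    ScanOptionsSketch build() { return new ScanOptionsSketch(this); }
  }

  public static void main(String[] args) {
    ScanOptionsSketch options = ScanOptionsSketch.builder()
        .project("a")
        .project("b")
        .strictSchema(true)
        .build();
    System.out.println(options.projection() + " strict=" + options.strictSchema());
  }
}
```

Each option is named at the call site and new options can be added without touching existing callers, which is the usual motivation for this refactoring.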

  1. … 224 more files in changeset.
DRILL-7068: Support memory adjustment framework for resource management with Queues. closes #1677

  1. … 37 more files in changeset.
DRILL-6381: Address code review comments (part 3).

DRILL-6381: Add missing joinControl logic for INTERSECT_DISTINCT.

- Modified HashJoin's probe phase to process INTERSECT_DISTINCT.

- NOTE: For the build phase, the functionality will be the same as for SemiJoin when it is added later.

DRILL-6381: Address code review comment for intersect_distinct.

DRILL-6381: Rebase on latest master and fix compilation issues.

DRILL-6381: Generate protobuf files for C++ native client.

DRILL-6381: Use shaded Guava classes. Add more comments and Javadoc.

  1. … 34 more files in changeset.
DRILL-6763: Codegen optimization of SQL functions with constant values (#1481)

closes #1481

  1. … 17 more files in changeset.
DRILL-6422: Replace guava imports with shaded ones

  1. … 982 more files in changeset.
DRILL-6656: Disallow extra semicolons and multiple statements on the same line.

closes #1415

  1. … 144 more files in changeset.
DRILL-6320: Fixed license headers.

closes #1207

  1. … 2063 more files in changeset.
DRILL-6295: PartitionerDecorator may close partitioners while CustomRunnable are active during query cancellation

This closes #1208

    • -130
    • +203
    ./PartitionerDecorator.java
  1. … 2 more files in changeset.
DRILL-6381: (Part 3) Planner and Execution implementation to support Secondary Indexes

  1. Index Planning Rules and Plan generators

    - DbScanToIndexScanRule: Top level physical planning rule that drives index planning for several relational algebra patterns.

    - DbScanSortRemovalRule: Physical planning rule for index planning for Sort-based operations.

    - Plan Generators: Covering, Non-Covering and Intersect physical plan generators.

    - Support planning with functional indexes such as CAST functions.

    - Enhance PlannerSettings with several configuration options for indexes.

  2. Index Selection and Statistics

    - An IndexSelector that supports cost-based index selection of covering and non-covering indexes using statistics and collation properties.

    - Costing of index intersection for comparison with single-index plans.

  3. Planning and execution operators

    - Support RangePartitioning physical operator during query planning and execution.

    - Support RowKeyJoin physical operator during query planning and execution.

    - HashTable and HashJoin changes to support RowKeyJoin and Index Intersection.

    - Enhance Materializer to keep track of subscan association with a particular rowkey join.

  4. Index Planning utilities

    - Utility classes to perform RexNode analysis, including conversion to and from SchemaPath.

    - Utility class to analyze filter condition and an input collation to determine output collation.

    - Helper classes to maintain index contexts for logical and physical planning phase.

    - IndexPlanUtils utility class for various helper methods.

  5. Miscellaneous

    - Separate physical rel for DirectScan.

    - Modify LimitExchangeTranspose rule to handle SingleMergeExchange.

    - MD-3880: Return correct status from RangePartitionRecordBatch setupNewSchema

Co-authored-by: Aman Sinha <asinha@maprtech.com>

Co-authored-by: chunhui-shi <cshi@maprtech.com>

Co-authored-by: Gautam Parai <gparai@maprtech.com>

Co-authored-by: Padma Penumarthy <ppenumar97@yahoo.com>

Co-authored-by: Hanumath Rao Maduri <hmaduri@maprtech.com>

Conflicts:

exec/java-exec/src/main/java/org/apache/drill/exec/physical/config/HashJoinPOP.java

exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/ScanBatch.java

exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/common/HashPartition.java

exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/common/HashTable.java

exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/common/HashTableTemplate.java

exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/HashJoinBatch.java

exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillRelOptUtil.java

exec/java-exec/src/main/java/org/apache/drill/exec/planner/fragment/Materializer.java

exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillMergeProjectRule.java

exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillOptiq.java

exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillPushProjectIntoScanRule.java

exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillScanRel.java

exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/BroadcastExchangePrel.java

exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/DrillDistributionTrait.java

exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/HashJoinPrel.java

exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/PrelUtil.java

exec/java-exec/src/main/java/org/apache/drill/exec/server/options/SystemOptionManager.java

exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetPushDownFilter.java

exec/java-exec/src/main/resources/drill-module.conf

logical/src/main/java/org/apache/drill/common/logical/StoragePluginConfig.java

Resolve merge conflicts and compilation issues.

  1. … 93 more files in changeset.
DRILL-6125: Fix possible memory leak when query is cancelled or finished.

close apache/drill#1105

  1. … 2 more files in changeset.
DRILL-5730: Mock testing improvements and interface improvements

closes #1045

  1. … 219 more files in changeset.
DRILL-5967: Fixed memory leak in OrderedPartitionSender

closes apache/drill#1073

  1. … 1 more file in changeset.
DRILL-5783, DRILL-5841, DRILL-5894: Rationalize test temp directories

This change includes:

DRILL-5783:

- A unit test is created for the priority queue in the TopN operator.

- The code generation classes passed around a completely unused function registry reference in some places so it is removed.

- The priority queue had unused parameters for some of its methods, so they are removed.

DRILL-5841:

- Created standardized temp directory classes DirTestWatcher, SubDirTestWatcher, and BaseDirTestWatcher, and updated all unit tests to use them.

DRILL-5894:

- Removed the dfs_test storage plugin for tests and replaced it with the already existing dfs storage plugin.

Misc:

- General code cleanup.

- Removed unnecessary use of String.format in the tests.

This closes #984

  1. … 365 more files in changeset.
DRILL-5116: Enable generated code debugging in each Drill operator

DRILL-5052 added the ability to debug generated code. The reviewer suggested

permitting the technique to be used for all Drill operators. This PR provides

the required fixes. Most were small changes, others dealt with the rather

clever way that the existing byte-code merge converted static nested classes

to non-static inner classes, with the way that constructors were inserted

at the byte-code level and so on. See the JIRA for the details.

This code passed the unit tests twice: once with the traditional byte-code

manipulations, a second time using "plain-old Java" code compilation.

Plain-old Java is turned off by default, but can be turned on for all

operators with a single config change: see the JIRA for info. Consider

the plain-old Java option to be experimental: very handy for debugging,

perhaps not quite tested enough for production use.

close apache/drill#716

  1. … 61 more files in changeset.
DRILL-4715: Fix java compilation error in run-time generated code when query has large number of expressions.

Refactor unit test in drillbit context initialization and pass in option manager.

close apache/drill#521

  1. … 53 more files in changeset.
DRILL-4442: Move getSV2 and getSV4 methods to VectorAccessible

Moves the methods up one level from their previous location, RecordBatch. Most

implementations already provide them because they implement RecordBatch rather than

VectorAccessible directly. The others now throw an unsupported operation exception.
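
A rough sketch of the resulting shape, assuming simplified stand-in types: `VectorAccessibleSketch`, `SelectionVector2Sketch`, and `PlainContainerSketch` are hypothetical and only mirror the idea of declaring the accessor on the broader interface and throwing where it does not apply.

```java
// Hypothetical marker type standing in for a two-byte selection vector.
final class SelectionVector2Sketch {
}

// Hypothetical simplification of the broader interface.
interface VectorAccessibleSketch {
  int getRecordCount();

  // Declared here so that any vector-accessible object can be asked for it.
  SelectionVector2Sketch getSelectionVector2();
}

// An implementation with no selection vector rejects the call explicitly.
class PlainContainerSketch implements VectorAccessibleSketch {
  @Override
  public int getRecordCount() {
    return 0;
  }

  @Override
  public SelectionVector2Sketch getSelectionVector2() {
    throw new UnsupportedOperationException("no selection vector available");
  }

  public static void main(String[] args) {
    VectorAccessibleSketch container = new PlainContainerSketch();
    try {
      container.getSelectionVector2();
    } catch (UnsupportedOperationException e) {
      System.out.println("unsupported: " + e.getMessage());
    }
  }
}
```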

  1. … 7 more files in changeset.
DRILL-4327: Fix rawtypes warnings in drill codebase

Fixing most rawtypes warning issues in drill modules.

Closes #347

  1. … 77 more files in changeset.
DRILL-3845: UnorderedReceiver shouldn't terminate until it receives a final batch

MergingRecordBatch doesn't wait for last batch when it's an early termination

this closes #319

  1. … 1 more file in changeset.
DRILL-4134: Allocator Improvements

- make Allocator mostly lockless

- change BaseAllocator maps to direct references

- add documentation around memory management model

- move transfer and ownership methods to DrillBuf

- Improve debug messaging.

- Fix/revert sort changes

- Remove unused fragment limit flag

- Add time to HistoricalLog events

- Remove reservation amount from RootAllocator constructor (since not allowed)

- Fix concurrency issue where allocator is closing at same moment as incoming batch transfer, causing leaked memory and/or query failure.

- Add new AutoCloseables.close(Iterable<AutoCloseable>)

- Remove extraneous DataResponseHandler and Impl (and update TestBitRpc to use smarter mock of FragmentManager)

- Remove the concept of poison pill record batches, using instead FragmentContext.isOverMemoryLimit()

- Update incoming data batches so that they are transferred under protection of a close lock

- Improve field names in IncomingBuffers and move synchronization to collectors as opposed to IncomingBuffers (also change decrementing to decrementToZero rather than two part check).

This closes #238.
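
For the item on closing an `Iterable<AutoCloseable>`, one common way to write such a helper is sketched below. `CloseAllSketch.closeAll` is a hypothetical name, and this is not claimed to be how Drill's `AutoCloseables.close` is implemented; it only shows the usual pattern of closing everything and reporting the first failure with later ones attached as suppressed exceptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class CloseAllSketch {
  // Closes every resource even if some fail. The first failure is rethrown
  // at the end; later failures are attached to it as suppressed exceptions.
  static void closeAll(Iterable<? extends AutoCloseable> resources) throws Exception {
    Exception first = null;
    for (AutoCloseable resource : resources) {
      if (resource == null) {
        continue; // tolerate nulls so callers need not filter them
      }
      try {
        resource.close();
      } catch (Exception e) {
        if (first == null) {
          first = e;
        } else {
          first.addSuppressed(e);
        }
      }
    }
    if (first != null) {
      throw first;
    }
  }

  public static void main(String[] args) {
    List<String> closed = new ArrayList<>();
    AutoCloseable ok = () -> closed.add("ok");
    AutoCloseable bad = () -> {
      closed.add("bad");
      throw new IllegalStateException("bad failed");
    };
    try {
      closeAll(Arrays.asList(bad, ok));
    } catch (Exception e) {
      System.out.println("closed " + closed + ", error: " + e.getMessage());
    }
  }
}
```

Closing all resources before rethrowing avoids leaking the later ones when an earlier `close()` fails.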

  1. … 119 more files in changeset.
DRILL-3987: (REFACTOR) Common and Vector modules building.

- Extract Accountor interface from Implementation

- Separate FMPP modules to separate out Vector Needs versus external needs

- Separate out Vector classes from those that are VectorAccessible.

- Cleanup Memory Exception hierarchy

  1. … 105 more files in changeset.
DRILL-1942-hygiene: - add AutoCloseable to many classes - minor fixes - formatting

this closes #133

  1. … 30 more files in changeset.
DRILL-3035: Created ControlsInjector interface to enforce method implementations + DRILL-2867: Add ControlsValidator to VALIDATORS only if assertions are enabled + return in ExecutionControls ctor if assertions are not enabled + added InjectorFactory class to align with the logger pattern

  1. … 22 more files in changeset.
Fix PartitionSenderRootExec possible memory leak.

DRILL-2755: (part2) Use and handle InterruptedException during query processing

  1. … 26 more files in changeset.
DRILL-2755: Use and handle InterruptedException during query processing.

- Interrupt FragmentExecutor thread as part of FragmentExecutor.cancel()

- Handle InterruptedException in ExternalSortBatch.newSV2(). If the fragment status says

it should not continue, then throw the InterruptedException to the caller, which returns IterOutcome.STOP

- Add comments regarding not handling InterruptedException in SendingAccountor.waitForSendComplete()

- Handle InterruptedException in OrderedPartitionRecordBatch.getPartitionVectors()

If interrupted in Thread.sleep calls and the fragment status says it should not run, then

return IterOutcome.STOP downstream.

- Interrupt partitioner threads if PartitionerRecordBatch is interrupted while waiting for

partitioner threads to complete.

- Preserve interrupt status if not handled

- Handle null RecordBatches returned by RawBatchBuffer.getNext() in MergingRecordBatch.buildSchema()

- Change timeout in Foreman to be proportional to the number of intermediate fragments sent instead

of hard coded limit of 90s.

- Change TimedRunnable to enforce a timeout of 15s per runnable.

Total timeout is (15s * numOfRunnableTasks) / parallelism.

- Add unit tests

* Testing cancelling a query interrupts the query fragments which are currently blocked

* Testing interrupting the partitioner sender which in turn interrupts its helper threads

* Testing TimedRunnable enforces timeout for the whole task list.
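
The "preserve interrupt status if not handled" item refers to a standard Java idiom, sketched here. `InterruptStatusSketch.sleepQuietly` is a hypothetical helper, not a method from Drill: when code catches `InterruptedException` but cannot act on it, it re-sets the thread's interrupt flag so that callers can still observe the interruption.

```java
final class InterruptStatusSketch {
  // Sleeps for the given time. Returns false if interrupted, after restoring
  // the interrupt flag that Thread.sleep cleared when it threw.
  static boolean sleepQuietly(long millis) {
    try {
      Thread.sleep(millis);
      return true;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // preserve status for the caller
      return false;
    }
  }

  public static void main(String[] args) {
    Thread.currentThread().interrupt(); // simulate a cancellation request
    boolean completed = sleepQuietly(10);
    // Thread.interrupted() reads and clears the flag.
    System.out.println("completed=" + completed
        + ", still interrupted=" + Thread.interrupted());
  }
}
```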

  1. … 27 more files in changeset.
DRILL-2809: Increase the default value of partitioner_sender_threads_factor.

  1. … 2 more files in changeset.