UserBitShared.proto

DRILL-6284: Add operator metrics for batch sizing for flatten

  1. … 6 more files in changeset.
DRILL-6331: Revisit Hive Drill native parquet implementation to be exposed to Drill optimizations (filter / limit push down, count to direct scan)

1. Factored out common logic for Drill parquet reader and Hive Drill native parquet readers: AbstractParquetGroupScan, AbstractParquetRowGroupScan, AbstractParquetScanBatchCreator.

2. Rules that previously worked only with ParquetGroupScan can now be applied to any class that extends AbstractParquetGroupScan: DrillFilterItemStarReWriterRule, ParquetPruneScanRule, PruneScanRule.

3. Hive populates partition values based on information returned from the Hive metastore. Drill populates partition values based on the path difference between the selection root and the actual file path.

Previously, ColumnExplorer populated partition values using the Drill approach. Since ColumnExplorer now also populates values for parquet files from Hive tables, the `populateImplicitColumns` method logic was changed to populate partition columns based only on the given partition values.

4. Refactored ParquetPartitionDescriptor to be responsible for populating partition values rather than storing this logic in parquet group scan class.

5. Metadata class was moved to a separate metadata package (org.apache.drill.exec.store.parquet.metadata). Factored out several inner classes to improve code readability.

6. Collected all Drill native parquet reader unit tests into one class TestHiveDrillNativeParquetReader, also added new tests to cover new functionality.

7. Reduced excessive logging when parquet files metadata is read

closes #1214

  1. … 64 more files in changeset.
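Item 3 above contrasts two ways of obtaining partition values. The path-based approach can be sketched roughly as follows. This is a minimal illustration, not Drill's ColumnExplorer code: the class name `PathPartitionSketch` and the method `partitionValues` are hypothetical, and paths are assumed to be plain `/`-separated strings.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration only; not part of Drill. */
final class PathPartitionSketch {
  private PathPartitionSketch() {}

  /**
   * Returns the directory names that lie between the selection root and the
   * file, in order. The last path element is the file name and is skipped.
   */
  static List<String> partitionValues(String selectionRoot, String filePath) {
    String root = stripTrailingSlash(selectionRoot);
    if (!filePath.startsWith(root + "/")) {
      throw new IllegalArgumentException(filePath + " is not under " + selectionRoot);
    }
    String relative = filePath.substring(root.length() + 1);
    String[] parts = relative.split("/");
    List<String> values = new ArrayList<>();
    for (int i = 0; i < parts.length - 1; i++) {
      values.add(parts[i]);
    }
    return values;
  }

  private static String stripTrailingSlash(String path) {
    return path.length() > 1 && path.endsWith("/")
        ? path.substring(0, path.length() - 1)
        : path;
  }
}
```

With a selection root of `/data/sales` and a file `/data/sales/2017/Q1/part-0.parquet`, the sketch yields the two values `2017` and `Q1`. A metastore-based approach would instead receive the values directly, which is why the commit passes partition values in rather than deriving them.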
DRILL-6381: (Part 1) Secondary Index framework

  1. Secondary Index planning interfaces and abstract classes like DBGroupScan, DbSubScan, IndexDescriptor etc.

  2. Statistics and Cost model interfaces/classes: PluginCost, Statistics, StatisticsPayload, AbstractIndexStatistics

  3. ScanBatch and RecordReader to support repeatable scan

  4. Secondary Index execution related interfaces: RangePartitionSender, RowKeyJoin, PartitionFunction

5. MD-3979: Query using cast index plan fails with NPE

Co-authored-by: Aman Sinha <asinha@maprtech.com>

Co-authored-by: chunhui-shi <cshi@maprtech.com>

Co-authored-by: Gautam Parai <gparai@maprtech.com>

Co-authored-by: Padma Penumarthy <ppenumar97@yahoo.com>

Co-authored-by: Hanumath Rao Maduri <hmaduri@maprtech.com>

Conflicts:

exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/ScanBatch.java

exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillRelOptUtil.java

exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillTable.java

protocol/src/main/java/org/apache/drill/exec/proto/UserBitShared.java

protocol/src/main/java/org/apache/drill/exec/proto/beans/CoreOperatorType.java

protocol/src/main/protobuf/UserBitShared.proto

  1. … 44 more files in changeset.
DRILL-6322: Lateral Join: Common changes - Add new iterOutcome, Operatortypes, MockRecordBatch for testing

Added new Iterator State EMIT, added operators LATERAL_JOIN & UNNEST in CoreOperatorType, and added LateralContract interface.

Implementation of MockRecordBatch to test operator behavior for different IterOutcomes. a) Creates a new output container for schema change cases. b) Doesn't create a new container for each next() call without a schema change, since the operator under test expects the ValueVector object in its incoming batch to be the same unless an OK_NEW_SCHEMA case is hit. This is because the setup() method of the operator under test stores the reference to the value vector received in the first batch.

This closes #1211

  1. … 6 more files in changeset.
DRILL-6130: Fix NPE during physical plan submission for various storage plugins

1. Fixed ser / de issues for Hive, Kafka, HBase plugins.

2. Added physical plan submission unit test for all storage plugins in contrib module.

3. Refactoring.

closes #1108

  1. … 26 more files in changeset.
DRILL-6049: Misc. hygiene and code cleanup changes

close apache/drill#1085

  1. … 123 more files in changeset.
DRILL-4779: Kafka storage plugin (Kamesh Bhallamudi & Anil Kumar Batchu)

closes #1052

  1. … 36 more files in changeset.
DRILL-5963: Query state process improvements

1. Added two new query states: PREPARING (when foreman is initialized) and PLANNING (includes logical and / or physical planning).

2. Ability to cancel query during planning and enqueued states was added.

3. Logic for submitting fragments was moved from Foreman to new class FragmentsRunner.

4. Logic for moving a query to a new state and incrementing / decrementing query counters was moved into the QueryStateProcessor class.

5. Major type in DrillFuncHolderExpr was cached for better performance.

closes #1051

  1. … 12 more files in changeset.
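Items 2 and 4 above describe moving a query between states while keeping per-state counters in step. A rough sketch of that bookkeeping follows. It is an illustration under assumptions, not Drill's QueryStateProcessor: the class `QueryStateSketch` is hypothetical, the state list is limited to names that appear in this history, and no transition validation is attempted.

```java
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch for illustration only; not Drill's implementation. */
final class QueryStateSketch {
  enum State { PREPARING, PLANNING, ENQUEUED, STARTING, CANCELLATION_REQUESTED }

  // Counters are shared between queries, so the caller supplies the map.
  private final Map<State, AtomicInteger> counters;
  private State state = State.PREPARING;

  QueryStateSketch(Map<State, AtomicInteger> counters) {
    this.counters = counters;
    counter(state).incrementAndGet();
  }

  synchronized State state() {
    return state;
  }

  /** Moves to the next state, decrementing the old counter and incrementing the new one. */
  synchronized void moveTo(State next) {
    counter(state).decrementAndGet();
    counter(next).incrementAndGet();
    state = next;
  }

  /** Requests cancellation from any state; returns false if already requested. */
  synchronized boolean cancel() {
    if (state == State.CANCELLATION_REQUESTED) {
      return false;
    }
    moveTo(State.CANCELLATION_REQUESTED);
    return true;
  }

  private AtomicInteger counter(State s) {
    return counters.computeIfAbsent(s, k -> new AtomicInteger());
  }
}
```

Keeping the counter updates inside `moveTo` means a cancellation during planning or while enqueued adjusts the counters the same way as any other transition.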
DRILL-5716: Queue-driven memory allocation

* Creates new core resource management and query queue abstractions.

* Adds queue information to the Protobuf layer.

* Foreman and Planner changes

- Abstracts memory management out to the new resource management layer.

This means deferring generating the physical plan JSON to later in the

process after memory planning.

* Web UI changes

* Adds queue information to the main page and the profile page to each

query.

* Also sorts the list of options displayed in the Web UI.

- Added memory reserve

A new config parameter, exec.queue.memory_reserve_ratio, sets aside a

slice of total memory for operators that do not participate in the

memory assignment process. The default is 20%; testing will tell us if

that value should be larger or smaller.

* Additional minor fixes

- Code cleanup.

- Added mechanism to abandon lease release during shutdown.

- Log queue configuration only when the config changes, rather than on

every query.

- Apply Boaz’ option to enforce a minimum memory allocation per

operator.

- Additional logging to help testers see what is happening.

closes #928

  1. … 57 more files in changeset.
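The memory reserve described above amounts to simple arithmetic: a fraction of total memory is withheld, and the remainder is what memory assignment can hand out. The sketch below illustrates that calculation only. The class and method names are hypothetical, and how Drill actually applies exec.queue.memory_reserve_ratio is not shown in this history.

```java
/** Hypothetical sketch for illustration only; not Drill's implementation. */
final class MemoryReserveSketch {
  private MemoryReserveSketch() {}

  /**
   * Returns the bytes left for memory assignment after setting aside
   * reserveRatio of the total (for example 0.20 for a 20% reserve).
   */
  static long assignableBytes(long totalBytes, double reserveRatio) {
    if (reserveRatio < 0.0 || reserveRatio >= 1.0) {
      throw new IllegalArgumentException("reserve ratio must be in [0, 1): " + reserveRatio);
    }
    return Math.round(totalBytes * (1.0 - reserveRatio));
  }
}
```

With the default 20% reserve, 1000 bytes of total memory would leave 800 bytes for operators that take part in memory assignment.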
DRILL-5512: Standardize error handling in ScanBatch

Standardizes error handling to throw a UserException. Prior code threw

various exceptions, called the fail() method, or returned a variety of

status codes.

closes #838

  1. … 5 more files in changeset.
DRILL-5190: Display planning and queued time for a query's profile page

Modified UserBitShared protobuf for marking planning and wait-in-queue end times. This will allow for accurately reporting the planning, queued and actual execution times of a query.

Planning Time:

In the absence of the planning time's end, for older profiles, the root fragment's (i.e. SCREEN operator) start time is taken as the estimated end of planning time, and as the estimated start time of the execution phase.

QueueWait Time:

We do not estimate the queue time if the planning end time is not available.

Execution Time:

We calculate the execution time based on the availability of these two end times. The computation is done in the following way, in decreasing order of accuracy:

1. Execution time = [end(QueueWait) - endTime(Query)]

2. Execution time = [end(Planning) - endTime(Query)]

3. Execution time = [start(rootFragment) - endTime(Query)] - {Estimated}
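The fallback order above can be sketched as a single function. This is an illustration under assumptions, not the profile page's code: the class `ExecutionTimeSketch` is hypothetical, timestamps are assumed to be epoch milliseconds, and each bracketed expression is read as the interval from the first timestamp to the query end.

```java
import java.util.OptionalLong;

/** Hypothetical sketch for illustration only; not Drill's implementation. */
final class ExecutionTimeSketch {
  private ExecutionTimeSketch() {}

  /**
   * Picks the most accurate available start of execution (queue-wait end,
   * then planning end, then root fragment start) and returns the duration
   * from that point to the end of the query.
   */
  static long executionMillis(OptionalLong queueWaitEnd,
                              OptionalLong planningEnd,
                              long rootFragmentStart,
                              long queryEnd) {
    long executionStart;
    if (queueWaitEnd.isPresent()) {
      executionStart = queueWaitEnd.getAsLong();
    } else if (planningEnd.isPresent()) {
      executionStart = planningEnd.getAsLong();
    } else {
      executionStart = rootFragmentStart; // estimate for older profiles
    }
    return queryEnd - executionStart;
  }
}
```

For a query ending at 1000 with a root fragment start of 250, the result is 700 when the queue wait ended at 300, 800 when only a planning end of 200 is known, and 750 when neither end time is recorded.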

closes #738

  1. … 7 more files in changeset.
DRILL-4280: CORE (Java protocol)

+ Define SaslStatus and SaslMessage messages in protocol

+ Add "authenticationMechanisms" field to all handshakes

+ Add "saslSupport" field to UserToBitHandshake

  1. … 20 more files in changeset.
DRILL-4726: Dynamic UDF Support

1) Configuration / parsing / options / protos

2) Zookeeper integration

3) Registration / unregistration / lazy-init

4) Unit tests

This closes #574

  1. … 71 more files in changeset.
DRILL-4792: Include session options used for a query as part of the profile

closes #551

  1. … 8 more files in changeset.
DRILL-4729: Add support for prepared statement implementation on server side

+ Add following APIs for Drill Java client

- DrillRpcFuture<CreatePreparedStatementResp> createPreparedStatement(final String query)

- void executePreparedStatement(final PreparedStatement preparedStatement, UserResultsListener resultsListener)

- List<QueryDataBatch> executePreparedStatement(final PreparedStatement preparedStatement) (for testing purpose)

+ Separated out the interface from UserClientConnection. It makes it easy to have wrappers which need to

tap the messages and data going to the actual client.

+ Implement CREATE_PREPARED_STATEMENT and handle RunQuery with PreparedStatement

+ Test changes to support prepared statement as query type

+ Add tests in TestPreparedStatementProvider

this closes #530

  1. … 37 more files in changeset.
DRILL-4132: Ability to submit a simple type of physical plan directly to an endpoint Drillbit for execution

There are multiple changes to achieve this:

1. During physical planning, split a single plan into multiple plans based on the number of minor fragments of the leaf major fragment.

a. Remove exchange operators during planning.

b. Produce just root fragments (which will also be leaf fragments).

2. Each fragment can be executed against the Drillbit it is assigned to, so as to keep locality.

The design document can be found in the JIRA: DRILL-4132

  1. … 27 more files in changeset.
DRILL-4187: introduce a new query state ENQUEUED and rename the state PENDING to STARTING

  1. … 7 more files in changeset.
DRILL-3340: Part 2: Reverting 1a589ab and committing latest patch

Add operator metrics registry for metric definitions

+ Display metrics as a table within an operator profile panel

+ Rename FragmentStats#getOperatorStats to newOperatorStats

  1. … 14 more files in changeset.
DRILL-3340: Added operator names and metric names to query profile before writing it to store

+ Rename: FragmentStats#getOperatorStats => newOperatorStats

+ Documentation

this closes #216

  1. … 12 more files in changeset.
DRILL-2997: Remove references to groupCount from SerializedField

  1. … 17 more files in changeset.
DRILL-2981: Add queries log. Update profile to store normal and verbose exception as well as node and errorid.

  1. … 20 more files in changeset.
DRILL-3010: Convert bad command error messages into UserExceptions in SqlHandlers

  1. … 22 more files in changeset.
DRILL-2675 (PART-2): Implement a subset of User Exceptions to improve how errors are reported to the user

Added missing changes from committed patch

  1. … 35 more files in changeset.
DRILL-2675: Implement a subset of User Exceptions to improve how errors are reported to the user

  1. … 38 more files in changeset.
DRILL-2762: Update Fragment state reporting and error collection

DeferredException

- Add new throwAndClear operation to allow checking for exceptions preClose in FragmentContext

- Add new getAndClear operation
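The two operations named above suggest a holder that accumulates exceptions and hands them over exactly once. A minimal sketch of that idea follows. It is not Drill's DeferredException: the class name `DeferredExceptionSketch` is hypothetical, and chaining later exceptions via `Throwable.addSuppressed` is an assumption made for this example.

```java
/** Hypothetical sketch for illustration only; not Drill's DeferredException. */
final class DeferredExceptionSketch {
  private Exception deferred;

  /** Records an exception; later ones are attached to the first as suppressed. */
  synchronized void addException(Exception e) {
    if (deferred == null) {
      deferred = e;
    } else if (deferred != e) { // a Throwable cannot suppress itself
      deferred.addSuppressed(e);
    }
  }

  /** Returns the recorded exception (or null) and resets the holder. */
  synchronized Exception getAndClear() {
    Exception e = deferred;
    deferred = null;
    return e;
  }

  /** Throws the recorded exception if there is one, resetting the holder either way. */
  synchronized void throwAndClear() throws Exception {
    Exception e = getAndClear();
    if (e != null) {
      throw e;
    }
  }
}
```

Clearing on read lets a caller check for errors before the final close without the same exception being reported a second time when close runs.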

BufferManager

- Ensure close() can be called multiple times by clearing managed buffer list on close().

FragmentContext/FragmentExecutor

- Update FragmentContext to have a preClose so that we can check closure state before doing final close.

- Update so that there is only a single state maintained between FragmentContext and FragmentExecutor

- Clean up FragmentExecutor run() method to better manage error states and have only single terminal point (avoiding multiple messages to Foreman).

- Add new CANCELLATION_REQUESTED state for FragmentState.

- Move all users of isCancelled or isFailed in main code to use shouldContinue()

- Update receivingFragmentFinished message to not cancel fragment (only inform root operator of cancellation)

WorkManager Updates

- Add new afterExecute command to the WorkManager ExecutorService so that we get log entries if a thread leaks an exception. (Otherwise logs don't show these exceptions and they only go to standard out.)
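The hook mentioned above corresponds to `ThreadPoolExecutor.afterExecute`, which the JDK calls after each task with any exception the task leaked. The sketch below shows the general pattern, not the WorkManager's code. The class name `LoggingExecutorSketch` and the `leaked` list are hypothetical, and `System.err` stands in for a real logger.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch for illustration only; not Drill's WorkManager. */
final class LoggingExecutorSketch extends ThreadPoolExecutor {
  // Kept so callers (and tests) can inspect what escaped from tasks.
  final List<Throwable> leaked = new CopyOnWriteArrayList<>();

  LoggingExecutorSketch(int threads) {
    super(threads, threads, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
  }

  @Override
  protected void afterExecute(Runnable r, Throwable t) {
    super.afterExecute(r, t);
    // t is non-null only for tasks started with execute(); submit() captures
    // the exception inside the returned Future instead.
    if (t != null) {
      leaked.add(t);
      System.err.println("Task leaked an exception: " + t);
    }
  }
}
```

Because the hook runs on the worker thread before the exception reaches the thread's uncaught-exception handler, it is a convenient single place to route such failures into the regular log.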

Profile Page

- Update profile page to show last update and last progress.

- Change durations to non-time presentation

Foreman/QueryManager

- Extract listenable interfaces into anonymous inner classes from body of Foreman

QueryManager

- Update QueryManager to track completed nodes rather than completed fragments using NodeTracker

- Update DrillbitStatusListener to decrement expected completion messages on Nodes that have died to avoid query hang when a node dies

FragmentData/MinorFragmentProfile

- Add ability to track last status update as well as last time fragment made progress

AbstractRecordBatch

- Update awareness of current cancellation state to avoid cancellation delays

Misc. Other changes

- Move ByteCode optimization code to only record assembly and code as trace messages

- Update SimpleRootExec to create fake ExecutorState to make existing tests work.

- Update sort to exit prematurely in the case that the fragment was asked to cancel.

- Add finals to all edited files.

- Modify control handler and FragmentManager to directly support receivingFragmentFinished

- Update receiver propagation message to avoid premature removal of fragment manager

- Update UserException.Builder to log a message if we're creating a new UserException (ERROR for System, INFO otherwise).

- Update Profile pages to use min and max instead of sorts.

  1. … 44 more files in changeset.
DRILL-2498: Separate QueryResult into two messages QueryResult and QueryData

  1. … 71 more files in changeset.
DRILL-2245: Clean up query setup and execution kickoff in Foreman/WorkManager in order to ensure consistent handling, and avoid hangs and races, with the goal of improving Drillbit robustness.

I did my best to keep these clean when I split them up, but this core commit

may depend on some minor changes in the hygiene commit that is also

associated with this bug, so either both should be applied, or neither.

The core commit should be applied first.

protocol/pom.xml

- updated protocol buffer compiler version to 2.6

- this made slight modifications to the formats of a few committed protobuf

files

AutoCloseables

- created org.apache.drill.common.AutoCloseables to handle closing these

quietly
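Closing resources quietly usually means attempting every close and reporting failures without letting them propagate. A rough sketch of that pattern follows. It is not the org.apache.drill.common.AutoCloseables class: the name `QuietCloseSketch`, the returned failure count, and the use of `System.err` in place of a logger are all assumptions for this example.

```java
/** Hypothetical sketch for illustration only; not Drill's AutoCloseables. */
final class QuietCloseSketch {
  private QuietCloseSketch() {}

  /**
   * Closes each non-null resource in order. Failures are reported and counted
   * rather than rethrown, so one bad close does not skip the remaining ones.
   */
  static int closeQuietly(AutoCloseable... closeables) {
    int failures = 0;
    for (AutoCloseable closeable : closeables) {
      if (closeable == null) {
        continue;
      }
      try {
        closeable.close();
      } catch (Exception e) {
        failures++;
        System.err.println("Failure while closing: " + e);
      }
    }
    return failures;
  }
}
```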

BaseTestQuery, and derivatives

- factored out pieces into QueryTestUtil so they can be reused

DeferredException:

- created this so we can collect exceptions during the shutdown process

Drillbit

- uses AutoCloseables for the WorkManager and for the storeProvider

- allow start() to take a RemoteServiceSet

- private, final, formatting

Foreman

- added new state CANCELLATION_REQUESTED (via UserBitShared.proto) to represent

the time between request of a cancellation, and acknowledgement from all

remote endpoints running fragments on a query's behalf

- created ForemanResult to manage interleaving cleanup effects/failure with

query result state

- does not need to implement Comparable

- does not need to implement Closeable

- thread blocking fixes

- add resultSent flag

- add code to log plan fragments with endpoint assignments

- added finals, cleaned up formatting

- do queue management in acquireQuerySemaphore; local tests pass

- rename getContext() to getQueryContext()

- retain DrillbitContext

- a couple of exception injections for testing

- minor formatting

- TODOs

FragmentContext

- added a DeferredException to collect errors during startup/shutdown sequences

FragmentExecutor

- eliminated CancelableQuery

- use the FragmentContext's DeferredException for errors

- common subexpression elimination

- cleaned up

QueryContext

- removed unnecessary functions (with some outside classes tweaked for this)

- finals, formatting

QueryManager

- merge in QueryStatus

- affects Foreman, ../batch/ControlHandlerImpl,

and ../../server/rest/ProfileResources

- made some methods private

- removed unused imports

- add finals and formatting

- variable renaming to improve readability

- formatting

- comments

- TODOs

QueryStatus

- getAsInfo() private

- member renaming

- member access changes

- formatting

- TODOs

QueryTestUtil, BaseTestQuery, TestDrillbitResilience

- make maxWidth a parameter to server startup

SelfCleaningRunnable

- created org.apache.drill.common.SelfCleaningRunnable
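A self-cleaning runnable can be pictured as a wrapper that always runs a cleanup step once the wrapped task finishes, whether or not it threw. The sketch below illustrates that idea only. It is not the org.apache.drill.common.SelfCleaningRunnable source, and the class name `SelfCleaningRunnableSketch` is hypothetical.

```java
/** Hypothetical sketch for illustration only; not Drill's SelfCleaningRunnable. */
abstract class SelfCleaningRunnableSketch implements Runnable {
  private final Runnable delegate;

  SelfCleaningRunnableSketch(Runnable delegate) {
    this.delegate = delegate;
  }

  @Override
  public final void run() {
    try {
      delegate.run();
    } finally {
      // Runs even when the delegate throws, so bookkeeping is never skipped.
      cleanup();
    }
  }

  /** Called exactly once per run(), after the delegate has finished. */
  protected abstract void cleanup();
}
```

A typical use would be removing a task's entry from a concurrent map of running work inside `cleanup()`, so the map cannot retain entries for tasks that failed.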

SingleRowListener

- created org.apache.drill.SingleRowListener results listener

- use in TestDrillbitResilience

TestComparisonFunctions

- fix not to close the FragmentContext multiple times

TestDrillbitResilience

- created org.apache.drill.exec.server.TestDrillbitResilience to test drillbit

resilience in the face of exceptions and failures during queries

TestWithZookeeper

- factor out work into ZookeeperHelper so that it can be reused by

TestDrillbitResilience

UserBitShared

- get rid of unused UNKNOWN_QUERY

WorkEventBus

- rename methods, affects Foreman and ControlHandlerImpl

- remove unused WorkerBee reference

- most members final

- formatting

WorkManager

- Closeable to AutoCloseable

- removed unused incomingFragments Set

- eliminated unnecessary eventThread and pendingTasks by posting Runnables

directly to executor

- use SelfCleaningRunnable for Foreman management

- FragmentExecutor management uses SelfCleaningRunnable

- runningFragments to be a ConcurrentHashMap; TestTpchDistributed passes

- other improvements due to bee no longer needed in various places

- most members final

- minor formatting

- comments

- TODOs

(*) Created exception injection classes to simulate exceptions for testing

- ExceptionInjection

- ExceptionInjector

- ExceptionInjectionUtil

- TestExceptionInjection

DRILL-2245-hygiene: General code cleanup encountered while working on the rest

of this commit. This includes

- making members final whenever possible

- making members private whenever possible

- making loggers private

- removing unused imports

- removing unused private functions

- removing unused public functions

- removing unused local variables

- removing unused private members

- deleting unused files

- cleaning up formatting

- adding spaces before braces in conditionals and loop bodies

- breaking up overly long lines

- removing extra blank lines

While I tried to keep this clean, this commit may have minor dependencies on

DRILL-2245-core that I missed. The intention is just to break this up for

review purposes. Either both commits should be applied, or neither.

  1. … 93 more files in changeset.
DRILL-2715: Implement nested loop join operator

  1. … 9 more files in changeset.
DRILL-1990: Add peak memory allocation in an operator to OperatorStats.

  1. … 8 more files in changeset.
DRILL-1684, DRILL-1517, DRILL-1350: Profile and cancellation updates

- Remove any storage of persisted profiles.

- Store a separate query info object for active queries.

- Update cancellation and running profile loading to query foreman server.

- Make file store support HDFS APIs.

- Update PStoreProvider to use configuration to decide if you want PERSISTENT, EPHEMERAL, or BLOB storage rather than separate interfaces.

- Update ZkPStore's persistent mode to leverage a cache and respond to changes rather than actively probing values.

- Update ZkPStore's cache to be effectively write-through.

- Automatically delete deprecated or default value options from PStore.

  1. … 42 more files in changeset.