UserBitShared.proto

DRILL-5512: Standardize error handling in ScanBatch

Standardizes error handling to throw a UserException. Prior code threw various exceptions, called the fail() method, or returned a variety of status codes.

closes #838

  1. … 5 more files in changeset.
DRILL-5190: Display planning and queued time for a query's profile page

Modified the UserBitShared protobuf to mark the planning and wait-in-queue end times. This allows accurate reporting of the planning, queued and actual execution times of a query.

Planning Time:

In the absence of the planning time's end, for older profiles, the root fragment's (i.e. SCREEN operator) start time is taken as the estimated end of planning time, and as the estimated start time of the execution phase.

QueueWait Time:

We do not estimate the queue time if the planning end time is not available.

Execution Time:

We calculate the execution time based on the availability of these two end times (planning and queue wait). The computation is done in the following way, in decreasing order of accuracy:

1. Execution time = [end(QueueWait) - endTime(Query)]

2. Execution time = [end(Planning) - endTime(Query)]

3. Execution time = [start(rootFragment) - endTime(Query)] - {Estimated}
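The fallback order above can be sketched in Java. This is a minimal illustration, not Drill's profile code: the class, method and parameter names are hypothetical, timestamps are assumed to be epoch milliseconds, and each bracketed expression is read as the span from the chosen start marker to the query end time.

```java
import java.util.OptionalLong;

/** Hypothetical helper; names are illustrative and not taken from Drill. */
final class ExecutionTimeSketch {

    /** Duration in milliseconds plus a flag saying whether it is only an estimate. */
    static final class Result {
        final long millis;
        final boolean estimated;

        Result(long millis, boolean estimated) {
            this.millis = millis;
            this.estimated = estimated;
        }
    }

    /** Picks the most accurate start marker that is available. */
    static Result executionTime(OptionalLong queueWaitEnd, OptionalLong planningEnd,
                                long rootFragmentStart, long queryEnd) {
        if (queueWaitEnd.isPresent()) {          // case 1: queue wait end recorded
            return new Result(queryEnd - queueWaitEnd.getAsLong(), false);
        }
        if (planningEnd.isPresent()) {           // case 2: only planning end recorded
            return new Result(queryEnd - planningEnd.getAsLong(), false);
        }
        // case 3: older profile, fall back to the root fragment's start time
        return new Result(queryEnd - rootFragmentStart, true);
    }

    public static void main(String[] args) {
        Result r = executionTime(OptionalLong.empty(), OptionalLong.of(1_200L), 1_250L, 5_000L);
        System.out.println(r.millis + " ms, estimated=" + r.estimated); // prints "3800 ms, estimated=false"
    }
}
```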

closes #738

  1. … 7 more files in changeset.
DRILL-4280: CORE (Java protocol)

+ Define SaslStatus and SaslMessage messages in protocol

+ Add "authenticationMechanisms" field to all handshakes

+ Add "saslSupport" field to UserToBitHandshake

  1. … 20 more files in changeset.
DRILL-4726: Dynamic UDF Support

1) Configuration / parsing / options / protos

2) Zookeeper integration

3) Registration / unregistration / lazy-init

4) Unit tests

This closes #574

  1. … 71 more files in changeset.
DRILL-4792: Include session options used for a query as part of the profile

closes #551

  1. … 8 more files in changeset.
DRILL-4729: Add support for prepared statement implementation on server side

+ Add following APIs for Drill Java client

- DrillRpcFuture<CreatePreparedStatementResp> createPreparedStatement(final String query)

- void executePreparedStatement(final PreparedStatement preparedStatement, UserResultsListener resultsListener)

- List<QueryDataBatch> executePreparedStatement(final PreparedStatement preparedStatement) (for testing purpose)

+ Separated out the interface from UserClientConnection. It makes it easy to have wrappers which need to tap the messages and data going to the actual client.

+ Implement CREATE_PREPARED_STATEMENT and handle RunQuery with PreparedStatement

+ Test changes to support prepared statement as query type

+ Add tests in TestPreparedStatementProvider

this closes #530

  1. … 37 more files in changeset.
DRILL-4132: Ability to submit simple type of physical plan directly to EndPoint DrillBit for execution.

There are multiple changes to achieve this:

1. During physical planning, split a single plan into multiple plans based on the number of minor fragments of the leaf major fragment.

a. Removing exchange operators during planning

b. Producing just root fragments (that will also be leaf fragments)

2. Each fragment can be executed against the Drillbit it is assigned to, so as to keep locality.

Design document can be found in the JIRA: DRILL-4132

  1. … 27 more files in changeset.
DRILL-4187: introduce a new query state ENQUEUED and rename the state PENDING to STARTING

  1. … 7 more files in changeset.
DRILL-3340: Part 2: Reverting 1a589ab and committing latest patch

Add operator metrics registry for metric definitions

+ Display metrics as a table within an operator profile panel

+ Rename FragmentStats#getOperatorStats to newOperatorStats

  1. … 14 more files in changeset.
DRILL-3340: Added operator names and metric names to query profile before writing it to store

+ Rename: FragmentStats#getOperatorStats => newOperatorStats

+ Documentation

this closes #216

  1. … 12 more files in changeset.
DRILL-2997: Remove references to groupCount from SerializedField

  1. … 17 more files in changeset.
DRILL-2981: Add queries log. Update profile to store normal and verbose exception as well as node and errorid.

  1. … 20 more files in changeset.
DRILL-3010: Convert bad command error messages into UserExceptions in SqlHandlers

  1. … 22 more files in changeset.
DRILL-2675 (PART-2): Implement a subset of User Exceptions to improve how errors are reported to the user

Added missing changes from committed patch

  1. … 35 more files in changeset.
DRILL-2675: Implement a subset of User Exceptions to improve how errors are reported to the user

  1. … 38 more files in changeset.
DRILL-2762: Update Fragment state reporting and error collection

DeferredException

- Add new throwAndClear operation to allow checking for exceptions preClose in FragmentContext

- Add new getAndClear operation
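The two operations named above could look roughly like the following. This is a hedged sketch of the general pattern only: the class is a hypothetical stand-in, not Drill's DeferredException, and keeping later exceptions as suppressed ones is an assumption.

```java
/** Hypothetical stand-in for an exception collector; not Drill's actual class. */
final class DeferredExceptionSketch {
    private Exception first;

    /** Records an exception; later ones are attached to the first as suppressed. */
    synchronized void addException(Exception e) {
        if (first == null) {
            first = e;
        } else if (e != first) {
            first.addSuppressed(e);
        }
    }

    /** Returns the collected exception (or null) and resets the collector. */
    synchronized Exception getAndClear() {
        Exception e = first;
        first = null;
        return e;
    }

    /** Throws the collected exception, if any, and resets the collector. */
    synchronized void throwAndClear() throws Exception {
        Exception e = getAndClear();
        if (e != null) {
            throw e;
        }
    }
}
```

A caller can invoke throwAndClear() before the final close to surface anything collected so far, then continue collecting during shutdown.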

BufferManager

- Ensure close() can be called multiple times by clearing managed buffer list on close().
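One way to make close() safe to call repeatedly is to release each managed buffer and then clear the list, so a second call finds nothing left to release. The classes below are hypothetical stand-ins written for illustration, not Drill's BufferManager.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical buffer that counts how often it was released. */
final class ManagedBufferSketch {
    int releaseCount = 0;

    void release() {
        releaseCount++;
    }
}

/** Hypothetical manager whose close() is idempotent. */
final class BufferManagerSketch implements AutoCloseable {
    private final List<ManagedBufferSketch> managed = new ArrayList<>();

    ManagedBufferSketch allocate() {
        ManagedBufferSketch b = new ManagedBufferSketch();
        managed.add(b);
        return b;
    }

    @Override
    public void close() {
        for (ManagedBufferSketch b : managed) {
            b.release();
        }
        // Clearing the list means a repeated close() releases nothing twice.
        managed.clear();
    }
}
```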

FragmentContext/FragmentExecutor

- Update FragmentContext to have a preClose so that we can check closure state before doing final close.

- Update so that there is only a single state maintained between FragmentContext and FragmentExecutor

- Clean up FragmentExecutor run() method to better manage error states and have only single terminal point (avoiding multiple messages to Foreman).

- Add new CANCELLATION_REQUESTED state for FragmentState.

- Move all users of isCancelled or isFailed in main code to use shouldContinue()

- Update receivingFragmentFinished message to not cancel fragment (only inform root operator of cancellation)

WorkManager Updates

- Add new afterExecute command to the WorkManager ExecutorService so that we get log entries if a thread leaks an exception. (Otherwise logs don't show these exceptions and they only go to standard out.)
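The afterExecute hook mentioned above is part of java.util.concurrent.ThreadPoolExecutor. A minimal sketch of logging leaked exceptions through it follows; the class name, the pool sizing, the runAndCapture helper and the use of java.util.logging are all assumptions made for illustration, not Drill's WorkManager code. Note that the Throwable argument is only non-null for tasks started with execute(); tasks started with submit() keep their exception inside the returned Future.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical executor that logs any exception a task lets escape. */
final class LoggingExecutorSketch extends ThreadPoolExecutor {
    private static final Logger LOG = Logger.getLogger(LoggingExecutorSketch.class.getName());

    /** Last escaped exception, kept only so the sketch can be inspected. */
    final AtomicReference<Throwable> lastLeaked = new AtomicReference<>();

    LoggingExecutorSketch() {
        super(1, 1, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>());
    }

    @Override
    protected void afterExecute(Runnable r, Throwable t) {
        super.afterExecute(r, t);
        if (t != null) {
            lastLeaked.set(t);
            LOG.log(Level.SEVERE, "Task leaked an exception", t);
        }
    }

    /** Runs one task to completion and returns the exception it leaked, if any. */
    static Throwable runAndCapture(Runnable task) {
        LoggingExecutorSketch pool = new LoggingExecutorSketch();
        pool.execute(task);
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return pool.lastLeaked.get();
    }
}
```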

Profile Page

- Update profile page to show last update and last progress.

- Change durations to non-time presentation

Foreman/QueryManager

- Extract listenable interfaces into anonymous inner classes from body of Foreman

QueryManager

- Update QueryManager to track completed nodes rather than completed fragments using NodeTracker

- Update DrillbitStatusListener to decrement expected completion messages on Nodes that have died to avoid query hang when a node dies
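The idea of tracking completion per node, and writing off a dead node's outstanding completions so the query does not hang, can be sketched as below. The class and method names are hypothetical and the bookkeeping is simplified; this is not Drill's NodeTracker or DrillbitStatusListener.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical per-node completion tracker; names are illustrative only. */
final class NodeCompletionSketch {
    private final Map<String, Integer> pendingByNode = new HashMap<>();

    /** Registers how many fragment completions are expected from a node. */
    synchronized void expect(String node, int fragments) {
        pendingByNode.merge(node, fragments, Integer::sum);
    }

    /** Records one completion; returns true once every node has finished. */
    synchronized boolean fragmentCompleted(String node) {
        // Returning null from the remapping function removes the node's entry.
        pendingByNode.computeIfPresent(node, (n, count) -> count > 1 ? count - 1 : null);
        return pendingByNode.isEmpty();
    }

    /** Drops everything still expected from a dead node; returns true if nothing is left. */
    synchronized boolean nodeDied(String node) {
        pendingByNode.remove(node);
        return pendingByNode.isEmpty();
    }
}
```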

FragmentData/MinorFragmentProfile

- Add ability to track last status update as well as last time fragment made progress

AbstractRecordBatch

- Update awareness of current cancellation state to avoid cancellation delays

Misc. Other changes

- Move ByteCode optimization code to only record assembly and code as trace messages

- Update SimpleRootExec to create fake ExecutorState to make existing tests work.

- Update sort to exit prematurely in the case that the fragment was asked to cancel.

- Add finals to all edited files.

- Modify control handler and FragmentManager to directly support receivingFragmentFinished

- Update receiver propagation message to avoid premature removal of fragment manager

- Update UserException.Builder to log a message if we're creating a new UserException (ERROR for System, INFO otherwise).

- Update Profile pages to use min and max instead of sorts.

  1. … 44 more files in changeset.
DRILL-2498: Separate QueryResult into two messages QueryResult and QueryData

  1. … 71 more files in changeset.
DRILL-2245: Clean up query setup and execution kickoff in Foreman/WorkManager in order to ensure consistent handling, and avoid hangs and races, with the goal of improving Drillbit robustness.

I did my best to keep these clean when I split them up, but this core commit may depend on some minor changes in the hygiene commit that is also associated with this bug, so either both should be applied, or neither. The core commit should be applied first.

protocol/pom.xml

- updated protocol buffer compiler version to 2.6

- this made slight modifications to the formats of a few committed protobuf

files

AutoCloseables

- created org.apache.drill.common.AutoCloseables to handle closing these

quietly
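Closing resources "quietly" usually means attempting every close and logging failures instead of propagating them. A minimal sketch under that assumption is shown here; the class and method names are hypothetical and do not reproduce the API of org.apache.drill.common.AutoCloseables.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical quiet-close helper; not the actual Drill utility. */
final class QuietCloseSketch {
    private static final Logger LOG = Logger.getLogger(QuietCloseSketch.class.getName());

    private QuietCloseSketch() {
    }

    /** Closes every non-null resource, logging failures; returns how many closes failed. */
    static int closeQuietly(AutoCloseable... resources) {
        int failures = 0;
        for (AutoCloseable resource : resources) {
            if (resource == null) {
                continue;
            }
            try {
                resource.close();
            } catch (Exception e) {
                failures++;
                LOG.log(Level.WARNING, "Failure while closing resource", e);
            }
        }
        return failures;
    }
}
```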

BaseTestQuery, and derivatives

- factored out pieces into QueryTestUtil so they can be reused

DeferredException:

- created this so we can collect exceptions during the shutdown process

Drillbit

- uses AutoCloseables for the WorkManager and for the storeProvider

- allow start() to take a RemoteServiceSet

- private, final, formatting

Foreman

- added new state CANCELLATION_REQUESTED (via UserBitShared.proto) to represent

the time between request of a cancellation, and acknowledgement from all

remote endpoints running fragments on a query's behalf

- created ForemanResult to manage interleaving cleanup effects/failure with

query result state

- does not need to implement Comparable

- does not need to implement Closeable

- thread blocking fixes

- add resultSent flag

- add code to log plan fragments with endpoint assignments

- added finals, cleaned up formatting

- do queue management in acquireQuerySemaphore; local tests pass

- rename getContext() to getQueryContext()

- retain DrillbitContext

- a couple of exception injections for testing

- minor formatting

- TODOs

FragmentContext

- added a DeferredException to collect errors during startup/shutdown sequences

FragmentExecutor

- eliminated CancelableQuery

- use the FragmentContext's DeferredException for errors

- common subexpression elimination

- cleaned up

QueryContext

- removed unnecessary functions (with some outside classes tweaked for this)

- finals, formatting

QueryManager

- merge in QueryStatus

- affects Foreman, ../batch/ControlHandlerImpl,

and ../../server/rest/ProfileResources

- made some methods private

- removed unused imports

- add finals and formatting

- variable renaming to improve readability

- formatting

- comments

- TODOs

QueryStatus

- getAsInfo() private

- member renaming

- member access changes

- formatting

- TODOs

QueryTestUtil, BaseTestQuery, TestDrillbitResilience

- make maxWidth a parameter to server startup

SelfCleaningRunnable

- created org.apache.drill.common.SelfCleaningRunnable
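The name suggests a Runnable that always performs its own cleanup, whether the work succeeds or throws. A hedged sketch of that pattern follows; the class and its method names are hypothetical and are not copied from org.apache.drill.common.SelfCleaningRunnable.

```java
/** Hypothetical base class: runs the work, then always runs cleanup. */
abstract class SelfCleaningRunnableSketch implements Runnable {

    @Override
    public final void run() {
        try {
            doRun();
        } finally {
            // Executed on normal completion and when doRun() throws.
            cleanup();
        }
    }

    /** The actual work. */
    protected abstract void doRun();

    /** Bookkeeping to undo, for example removing this task from a map of running tasks. */
    protected abstract void cleanup();
}
```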

SingleRowListener

- created org.apache.drill.SingleRowListener results listener

- use in TestDrillbitResilience

TestComparisonFunctions

- fix not to close the FragmentContext multiple times

TestDrillbitResilience

- created org.apache.drill.exec.server.TestDrillbitResilience to test drillbit

resilience in the face of exceptions and failures during queries

TestWithZookeeper

- factor out work into ZookeeperHelper so that it can be reused by

TestDrillbitResilience

UserBitShared

- get rid of unused UNKNOWN_QUERY

WorkEventBus

- rename methods, affects Foreman and ControlHandlerImpl

- remove unused WorkerBee reference

- most members final

- formatting

WorkManager

- Closeable to AutoCloseable

- removed unused incomingFragments Set

- eliminated unnecessary eventThread and pendingTasks by posting Runnables

directly to executor

- use SelfCleaningRunnable for Foreman management

- FragmentExecutor management uses SelfCleaningRunnable

- runningFragments to be a ConcurrentHashMap; TestTpchDistributed passes

- other improvements due to bee no longer needed in various places

- most members final

- minor formatting

- comments

- TODOs

(*) Created exception injection classes to simulate exceptions for testing

- ExceptionInjection

- ExceptionInjector

- ExceptionInjectionUtil

- TestExceptionInjection

DRILL-2245-hygiene: General code cleanup encountered while working on the rest of this commit. This includes

- making members final whenever possible

- making members private whenever possible

- making loggers private

- removing unused imports

- removing unused private functions

- removing unused public functions

- removing unused local variables

- removing unused private members

- deleting unused files

- cleaning up formatting

- adding spaces before braces in conditionals and loop bodies

- breaking up overly long lines

- removing extra blank lines

While I tried to keep this clean, this commit may have minor dependencies on DRILL-2245-core that I missed. The intention is just to break this up for review purposes. Either both commits should be applied, or neither.

  1. … 93 more files in changeset.
DRILL-2715: Implement nested loop join operator

  1. … 9 more files in changeset.
DRILL-1990: Add peak memory allocation in an operator to OperatorStats.

  1. … 8 more files in changeset.
DRILL-1684, DRILL-1517, DRILL-1350: Profile and cancellation updates

- Remove any storage of persisted profiles.

- Store a separate query info object for active queries.

- Update cancellation and running profile loading to query foreman server.

- Make file store support HDFS APIs

- Update PStoreProvider to use configuration to decide if you want PERSISTENT, EPHEMERAL, or BLOB storage rather than separate interfaces.

- Update ZkPStore's persistent mode to leverage a cache and respond to changes rather than actively probing values.

- Update ZkPStore's cache to be effectively write-through.

- Automatically delete deprecated or default value options from PStore.

  1. … 42 more files in changeset.
DRILL-1436: Remove use of UDP based cache for purposes of intermediate PlanFragment distribution

Includes:

- Remove dependency on Infinispan

- Update initialize fragments to send in batches.

- Update RPC layer to capture UserRpcExceptions and propagate back.

- Send full stack trace in DrillPBError and let foreman node decide on formatting.

- Increment control rpc version

- Update systables to report current drillbit and version

  1. … 65 more files in changeset.
DRILL-1512: Avro record reader

Reader for Avro data files.

Supports:

- All primitive types

- Arrays

- Nested records

- Enums

Unimplemented:

- Endpoint affinity

- Recursive data types

- Complex types: Maps, Fixed, Unions

  1. … 15 more files in changeset.
Patch for DRILL-705

Currently only supports partitioning/ordering, not yet preceding or after offsets

  1. … 77 more files in changeset.
DRILL-1425: Handle unknown operators in web ui

  1. … 3 more files in changeset.
DRILL-1328: Support table statistics - Part 2

Add support for avg row-width and major type statistics.

Parallelize the ANALYZE implementation and stats UDF implementation to improve stats collection performance.

Update/fix rowcount, selectivity and ndv computations to improve plan costing.

Add options for configuring collection/usage of statistics.

Add new APIs and implementation for stats writer (as a precursor to Drill Metastore APIs).

Fix several stats/costing related issues identified while running TPC-H and TPC-DS queries.

Add support for CPU sampling and nested scalar columns.

Add more testcases for collection and usage of statistics and fix remaining unit/functional test failures.

Thanks to Venki Korukanti (@vkorukanti) for the description below (modified to account for new changes). He graciously agreed to rebase the patch to latest master, fixed a few issues and added a few tests.

FUNCS: Statistics functions as UDFs:

Separate

Currently using FieldReader to ensure consistent output type so that Unpivot doesn't get confused. All stats columns should be Nullable, so that stats functions can return NULL when N/A.

* custom versions of "count" that always return BigInt

* HyperLogLog based NDV that returns BigInt that works only on VarChars

* HyperLogLog with binary output that only works on VarChars

OPS: Updated protobufs for new ops

OPS: Implemented StatisticsMerge

OPS: Implemented StatisticsUnpivot

ANALYZE: AnalyzeTable functionality

* JavaCC syntax more-or-less copied from LucidDB.

* (Basic) AnalyzePrule: DrillAnalyzeRel -> UnpivotPrel StatsMergePrel FilterPrel(for sampling) StatsAggPrel ScanPrel

ANALYZE: Add getMetadataTable() to AbstractSchema

USAGE: Change field access in QueryWrapper

USAGE: Add getDrillTable() to DrillScanRelBase and ScanPrel

* since ScanPrel does not inherit from DrillScanRelBase, this requires adding a DrillTable to the constructor

* This is done so that a custom ReflectiveRelMetadataProvider can access the DrillTable associated with Logical/Physical scans.

USAGE: Attach DrillStatsTable to DrillTable.

* DrillStatsTable represents the data scanned from a corresponding ".stats.drill" table

* In order to avoid doing query execution right after the ".stats.drill" table is found, metadata is not actually collected until the MaterializationVisitor is used.

** Currently, the metadata source must be a string (so that a SQL query can be created). Doing this with a table is probably more complicated.

** Query is set up to extract only the most recent statistics results for each column.

closes #729

  1. … 143 more files in changeset.
DRILL-1055: Add ProducerConsumer operator to scans

This can be disabled. The queue size is configurable
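A producer-consumer stage with a configurable queue size can be sketched with a bounded BlockingQueue: a producer thread reads ahead until the queue is full, and the consumer takes batches as it needs them. Everything below is a hypothetical illustration (string "batches", an empty Optional as end marker), not Drill's ProducerConsumer operator.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Hypothetical read-ahead stage between a scan (producer) and its consumer. */
final class ProducerConsumerSketch {

    private ProducerConsumerSketch() {
    }

    /** Passes all batches through a bounded queue of the given size (must be at least 1). */
    static List<String> readAll(List<String> batches, int queueSize) {
        BlockingQueue<Optional<String>> queue = new ArrayBlockingQueue<>(queueSize);

        Thread producer = new Thread(() -> {
            try {
                for (String batch : batches) {
                    queue.put(Optional.of(batch));   // blocks while the queue is full
                }
                queue.put(Optional.empty());         // end-of-data marker
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.setDaemon(true);
        producer.start();

        List<String> consumed = new ArrayList<>();
        try {
            for (Optional<String> next = queue.take(); next.isPresent(); next = queue.take()) {
                consumed.add(next.get());
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return consumed;
    }
}
```

A small queue size bounds how far the producer can run ahead of the consumer, which in turn bounds the memory held in queued batches.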

  1. … 13 more files in changeset.
DRILL-1069: Rename RandomReceiver to UnorderedReceiver.

  1. … 12 more files in changeset.
DRILL-836: [addendum] Drill needs to return complex types (e.g., map and array) as a JSON string

* This contains additional changes to the original patch which was merged.

+ Renamed "flatten" to "complex-to-json"

+ With the new patch, we return VARCHAR instead of VARBINARY.

+ Added test case.

+ Minor code re-factoring.

  1. … 38 more files in changeset.
Fix and improve runtime stats profiles

- Stop stats processing while waiting for next.

- Fix stats collection in PartitionSender and ScanBatch

- Add stats to all senders

- Add wait time to operator profile.
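Separating wait time from processing time amounts to pausing the processing clock whenever the operator blocks on its upstream. The sketch below shows one way to do that; the class and method names are hypothetical, and timestamps are passed in explicitly (for example from System.nanoTime()) to keep the example deterministic. It is not Drill's OperatorStats.

```java
/** Hypothetical operator timer that tracks processing and wait time separately. */
final class OperatorTimerSketch {
    private long processingNanos;
    private long waitNanos;
    private long mark;

    /** The operator starts doing its own work. */
    void startProcessing(long now) {
        mark = now;
    }

    /** The operator begins waiting on upstream: bank processing time and start the wait clock. */
    void startWait(long now) {
        processingNanos += now - mark;
        mark = now;
    }

    /** The wait is over: bank wait time and resume the processing clock. */
    void stopWait(long now) {
        waitNanos += now - mark;
        mark = now;
    }

    /** The operator finishes its own work. */
    void stopProcessing(long now) {
        processingNanos += now - mark;
    }

    long processingNanos() {
        return processingNanos;
    }

    long waitNanos() {
        return waitNanos;
    }
}
```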

  1. … 16 more files in changeset.