DRILL-7456: Batch count fixes for 12 operators

Enables batch validation for 12 additional operators:

* MergingRecordBatch

* OrderedPartitionRecordBatch

* RangePartitionRecordBatch

* TraceRecordBatch

* UnionAllRecordBatch

* UnorderedReceiverBatch

* UnpivotMapsRecordBatch

* WindowFrameRecordBatch

* TopNBatch

* HashJoinBatch

* ExternalSortBatch

* WriterRecordBatch

Fixes issues found with those checks so that this set of operators passes all checks.

Includes code cleanup in many files touched during this work.

closes #1906

  1. … 46 more files in changeset.
DRILL-7450: Improve performance for ANALYZE command

- Implement two-phase aggregation for the lowest metadata aggregate to optimize performance

- Allow using complex functions with hash aggregate

- Use hash aggregation for PHASE_1of2 for ANALYZE to reduce memory usage and avoid sorting non-aggregated data

- Add sort above hash aggregation to fix correctness of merge exchange and stream aggregate

closes #1907
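
The two-phase aggregation described above can be sketched roughly as follows. This is a minimal illustration, not Drill's operator code: the class and method names are hypothetical, and a simple count stands in for the metadata aggregates. Phase 1 hash-aggregates each partition locally without sorting; phase 2 merges the partial results, with a sorted map standing in for the sort placed above the hash aggregate.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not Drill's operator code.
final class TwoPhaseAggSketch {

  // Phase 1: hash-aggregate one partition locally (no sort required).
  static Map<String, Long> partialCount(List<String> keys) {
    Map<String, Long> partial = new HashMap<>();
    for (String key : keys) {
      partial.merge(key, 1L, Long::sum);
    }
    return partial;
  }

  // Phase 2: merge the partial results. The TreeMap keeps keys ordered,
  // standing in for the sort added above the hash aggregate.
  static Map<String, Long> mergeCounts(List<Map<String, Long>> partials) {
    Map<String, Long> total = new TreeMap<>();
    for (Map<String, Long> partial : partials) {
      partial.forEach((key, count) -> total.merge(key, count, Long::sum));
    }
    return total;
  }

  public static void main(String[] args) {
    Map<String, Long> p1 = partialCount(List.of("a", "b", "a"));
    Map<String, Long> p2 = partialCount(List.of("b", "c"));
    System.out.println(mergeCounts(List.of(p1, p2)));
  }
}
```

Only the per-key partial results cross the phase boundary, which is why the first phase can reduce memory use and avoid sorting non-aggregated rows.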

    • -102
    • +103
    ./TestHashAggEmitOutcome.java
  1. … 58 more files in changeset.
DRILL-7350: Move RowSet related classes from test folder

  1. … 290 more files in changeset.
DRILL-7310: Move schema-related classes from exec module to be able to use them in metastore module

closes #1816

  1. … 102 more files in changeset.
DRILL-7273: Introduce operators for handling metadata

closes #1886

    • -55
    • +30
    ./TestStreamingAggEmitOutcome.java
  1. … 155 more files in changeset.
DRILL-6951: Merge row set based mock data source

The mock data source is used in several tests to generate a large volume of sample data, such as when testing spilling. The mock data source also lets us try new plugin features in a very simple context. During the development of the row set framework, the mock data source was converted to use the new framework to verify functionality. This commit upgrades the mock data source with that work.

The work changes none of the functionality. It does, however, improve memory usage. Batches are limited, by default, to 10 MB in size. The row set framework minimizes internal fragmentation in the largest vector. (Previously, internal fragmentation averaged 25% but could be as high as 50%.)
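
The 25% and 50% figures are consistent with buffers whose sizes are rounded up to the next power of two. The sketch below works through that arithmetic under that assumption; the commit message does not state the allocation scheme, and the class and method names are hypothetical.

```java
// Hypothetical illustration; assumes buffers are sized to a power of two.
final class FragmentationSketch {

  // Smallest power of two that is >= n, for 1 <= n <= 2^30.
  static int roundUpToPowerOfTwo(int n) {
    int highest = Integer.highestOneBit(n);
    return highest == n ? n : highest << 1;
  }

  // Fraction of the allocated buffer that holds no data.
  static double wastedFraction(int usedBytes) {
    int allocated = roundUpToPowerOfTwo(usedBytes);
    return (allocated - usedBytes) / (double) allocated;
  }

  public static void main(String[] args) {
    for (int used : new int[] {1024, 1025, 1536}) {
      System.out.println(used + " bytes used, wasted fraction " + wastedFraction(used));
    }
  }
}
```

A buffer that is exactly full wastes nothing, one that just overflows a power of two wastes almost half of the doubled allocation, and a buffer halfway between two powers of two wastes a quarter.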

As it turns out, the hash aggregate tests depended on the internal fragmentation: without it, the hash agg no longer spilled for the same row count. Adjusted the generated row counts to recreate a data volume that caused spilling.

One test in particular always failed due to assertions in the hash agg code. These seem to be true bugs and are described in DRILL-7301. After multiple failed attempts to get the test to work, it was disabled until DRILL-7301 is fixed.

Added a new unit test to sanity check the mock data source. (No test already existed for this functionality except as verified via other unit tests.)

  1. … 21 more files in changeset.
DRILL-6901: Move schema builder to src/main

Moves the SchemaBuilder class out of the src/test name space into the src/main namespace. Specifically, into the existing record.metadata package.

Many files changed in this move. Corrected two minor issues: import of the wrong Arrays class and unnecessary annotations.

  1. … 87 more files in changeset.
DRILL-6766: Lateral Unnest query: IllegalStateException - rowId in right batch of lateral is smaller than rowId in left batch being processed

Note: The issue was in StreamingAgg. When the output from one or more input batches was split across multiple output batches, the remaining input records were discarded after the first output batch was produced.

closes #1490

    • -3
    • +312
    ./TestStreamingAggEmitOutcome.java
  1. … 6 more files in changeset.
DRILL-6422: Replace guava imports with shaded ones

  1. … 982 more files in changeset.
DRILL-6656: Disallow extra semicolons and multiple statements on the same line.

closes #1415

  1. … 143 more files in changeset.
DRILL-6654: Data verification failure with lateral unnest query having filter in and order by

closes #1418

    • -0
    • +137
    ./TestStreamingAggEmitOutcome.java
  1. … 1 more file in changeset.
DRILL-6631: Streaming agg causes queries with Lateral and Unnest to return incorrect results.

This commit fixes issues with handling straight aggregates (no group by) with empty batches received between EMIT(s).

closes #1399

    • -0
    • +553
    ./TestStreamingAggEmitOutcome.java
  1. … 2 more files in changeset.
DRILL-6516: EMIT support in streaming agg

This closes #1358

    • -0
    • +614
    ./TestStreamingAggEmitOutcome.java
  1. … 4 more files in changeset.
DRILL-6479: Support EMIT for the Hash Aggr

closes #1311

    • -0
    • +476
    ./TestHashAggEmitOutcome.java
  1. … 7 more files in changeset.
DRILL-6461: Added basic data correctness tests for hash agg, and improved operator unit testing framework.

git closes #1344

    • -0
    • +212
    ./TestHashAggBatch.java
  1. … 35 more files in changeset.
DRILL-6438: Remove excess logging from the tests.

- Removed usages of System.out and System.err from the tests and replaced them with loggers

closes #1284

  1. … 90 more files in changeset.
DRILL-6422: Update guava to 23.0 and shade it

- Fix compilation errors for new version of Guava.

- Remove usage of deprecated API

- Shade guava and add dependencies to the shaded version

- Ban unshaded package

- Introduce drill-shaded module and move guava-shaded under it

- Add methods to convert shaded guava lists to the unshaded ones

- Add instruction for publishing artifacts to the Apache repository

  1. … 82 more files in changeset.
DRILL-6320: Fixed license headers.

closes #1207

  1. … 2064 more files in changeset.
DRILL-6375: Support for ANY_VALUE aggregate function

closes #1256

    • -0
    • +149
    ./TestAggWithAnyValue.java
  1. … 36 more files in changeset.
DRILL-5730: Mock testing improvements and interface improvements

closes #1045

  1. … 223 more files in changeset.
DRILL-6049: Misc. hygiene and code cleanup changes

close apache/drill#1085

  1. … 123 more files in changeset.
DRILL-5783, DRILL-5841, DRILL-5894: Rationalize test temp directories

This change includes:

DRILL-5783:

- A unit test is created for the priority queue in the TopN operator.

- The code generation classes passed around a completely unused function registry reference in some places, so it was removed.

- The priority queue had unused parameters for some of its methods, so they were removed.

DRILL-5841:

- Created standardized temp directory classes DirTestWatcher, SubDirTestWatcher, and BaseDirTestWatcher, and updated all unit tests to use them.

DRILL-5894:

- Removed the dfs_test storage plugin for tests and replaced it with the already existing dfs storage plugin.

Misc:

- General code cleanup.

- Removed unnecessary use of String.format in the tests.

This closes #984

  1. … 363 more files in changeset.
DRILL-5832: Change OperatorFixture to use system option manager

- Rename FixtureBuilder to ClusterFixtureBuilder

- Provide alternative way to reset system/session options

- Fix for DRILL-5833: random failure in TestParquetWriter

- Provide strict, but clear, errors for missing options

closes #970

  1. … 51 more files in changeset.
DRILL-5694: Handle OOM in HashAggr by spill and retry, reserve memory, spinner

  1. … 20 more files in changeset.
DRILL-5798 changes include:

- Fixed unstable StatusResourcesTest

- Fixed buggy port hunting for the WebService

- Fixed bug in ClientFixture, which did not assign the correct user port when port hunting was done for the user port

- Fixed unstable hash agg tests

closes #945
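
Port hunting, as mentioned in the list above, generally means trying successive ports until one binds. The sketch below shows that idea with the standard `java.net.ServerSocket` API; it is not the WebService or ClientFixture code, and the class name, method name, and attempt limit are hypothetical. The caller has to read the port that was actually bound rather than assume the starting port, which is the kind of mismatch the ClientFixture fix refers to.

```java
import java.io.IOException;
import java.net.ServerSocket;

// Hypothetical illustration of port hunting; not Drill's implementation.
final class PortHuntSketch {

  // Tries startPort, startPort + 1, ... and returns the first socket that binds.
  static ServerSocket bindFirstFree(int startPort, int maxAttempts) throws IOException {
    IOException last = null;
    for (int i = 0; i < maxAttempts; i++) {
      try {
        return new ServerSocket(startPort + i);
      } catch (IOException e) {
        last = e; // Port unavailable: move on to the next candidate.
      }
    }
    throw new IOException("No free port found starting at " + startPort, last);
  }

  public static void main(String[] args) throws IOException {
    // 8047 is an arbitrary starting port chosen for the example.
    try (ServerSocket socket = bindFirstFree(8047, 100)) {
      // Always report the bound port, which may differ from the requested one.
      System.out.println("Bound to port " + socket.getLocalPort());
    }
  }
}
```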

  1. … 7 more files in changeset.
DRILL-5752 this change includes:

1. Increased test parallelism and fixed associated bugs

2. Added test categories and categorized tests appropriately

- Don't exclude anything by default

- Increase test timeout

- Fixed flaky test

closes #940

  1. … 263 more files in changeset.
DRILL-5737: Hash Agg uses more than the allocated memory under certain low memory conditions

Note: Provides a new config parameter, HASHAGG_FALLBACK_ENABLED, which is set to true by default. When 2 Phase HashAgg doesn't have enough memory to hold 2 partitions, then based on this flag it either falls back to the old behavior of consuming unbounded memory or fails the query.

close apache/drill#920
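
The decision governed by this flag can be sketched as below. This is a simplified reading of the note, not the HashAgg code: the class, enum, method, and parameter names are hypothetical, and the real memory check is likely more involved.

```java
// Hypothetical illustration of the fallback decision; not Drill's HashAgg code.
final class HashAggFallbackSketch {

  enum Decision { RUN_WITHIN_LIMIT, FALL_BACK_UNBOUNDED, FAIL_QUERY }

  // memoryLimit and bytesPerPartition are assumed inputs for the example.
  static Decision decide(long memoryLimit, long bytesPerPartition, boolean fallbackEnabled) {
    boolean canHoldTwoPartitions = memoryLimit >= 2 * bytesPerPartition;
    if (canHoldTwoPartitions) {
      return Decision.RUN_WITHIN_LIMIT;
    }
    // Not enough memory for two partitions: the flag picks the outcome.
    return fallbackEnabled ? Decision.FALL_BACK_UNBOUNDED : Decision.FAIL_QUERY;
  }

  public static void main(String[] args) {
    System.out.println(decide(100, 40, true));
    System.out.println(decide(100, 60, true));
    System.out.println(decide(100, 60, false));
  }
}
```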

  1. … 4 more files in changeset.
DRILL-5723: Added System Internal Options That can be Modified at Runtime

Changes include:

1. Addition of internal options.

2. Refactoring of OptionManagers and OptionValidators.

3. Fixed ambiguity in the meaning of an option type, and changed its name to accessibleScopes.

4. Updated javadocs in the Option System classes.

5. Added RestClientFixture for testing the Rest API.

6. Fixed flaky test in TestExceptionInjection caused by race condition.

7. Fixed various tests which started zookeeper but failed to shut it down at the end of tests.

8. Added port hunting to the Drill Webserver for testing

9. Fixed various flaky tests

10. Fix compile issue

closes #923

  1. … 85 more files in changeset.
DRILL-5457: Spill implementation for Hash Aggregate

closes #822

    • -0
    • +141
    ./TestHashAggrSpill.java
  1. … 35 more files in changeset.
DRILL-5485: Remove WebServer dependency on DrillClient

1. Added WebUserConnection/AnonWebUserConnection and their providers for Authenticated and Anonymous web users.

2. Updated to store the UserSession, BufferAllocator and other session states inside the HttpSession of Jetty instead of storing them in DrillUserPrincipal. A new instance of WebUserConnection is now created for each request. However, for authenticated users the UserSession and other states are reused, whereas for anonymous users they are created for each request and recycled after query execution.

close #829

  1. … 46 more files in changeset.