drill

DRILL-6289: Cluster view should show more relevant information

Protobuf change to carry HTTP port info

Allow CORS for access to remote Drillbit metrics

Cross-origin resource sharing (CORS) is required to ensure that the WebServer is able to serve REST calls for status pages.

Materialize relevant metrics

1. Heap memory (incl usage)

2. Heap memory (incl usage)

3. Average System Load (last 1 min)

4. Option to view from other nodes (pop out)

5. Added Glyphicons

Update System Table and related tests

1. Updated System Table to show HTTP port

2. Updated unit tests

Skip updating remote bit info when HTTPS (SSL) or Authentication is enabled.

Default CpuGaugeSet is public; Added Gauges

* CPU Utilization by Drill

* Uptime

Show ALL Buttons, but do HTTPS Check

Reduce power button to icon

Allowing CORS for /status/metrics only

Accounting for situations when JVM does not report Process CPU Load

i.e. returned value is negative.

See https://docs.oracle.com/javase/7/docs/jre/api/management/extension/com/sun/management/OperatingSystemMXBean.html#getProcessCpuLoad()
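
A minimal sketch of the guard described above, assuming the gauge simply reports 0.0 when the JVM cannot supply a value. The class and method names are hypothetical and are not taken from Drill's CpuGaugeSet; only `getProcessCpuLoad()` on `com.sun.management.OperatingSystemMXBean` is a real JDK call.

```java
import java.lang.management.ManagementFactory;

// Hypothetical helper for illustration; not Drill's actual gauge class.
final class ProcessCpuLoadSketch {

  // getProcessCpuLoad() returns a negative value when the load is not available.
  // Treat that case (and NaN) as 0.0 so the gauge never publishes a bogus value.
  static double sanitize(double rawLoad) {
    return rawLoad >= 0 ? rawLoad : 0.0;
  }

  // Reads the process CPU load in [0.0, 1.0], or 0.0 when the JVM does not report it.
  static double currentProcessCpuLoad() {
    java.lang.management.OperatingSystemMXBean os =
        ManagementFactory.getOperatingSystemMXBean();
    if (os instanceof com.sun.management.OperatingSystemMXBean) {
      return sanitize(((com.sun.management.OperatingSystemMXBean) os).getProcessCpuLoad());
    }
    // The extended MXBean is not available on this JVM.
    return 0.0;
  }

  public static void main(String[] args) {
    System.out.println("Process CPU load: " + currentProcessCpuLoad());
  }
}
```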

Addressed shutdown security conditions

Added C++ Client Protobuf

Added steps for Protobuf generation to protocol/readme.txt

This closes #1203

    • -0
    • +1
    /protocol/src/main/protobuf/Coordination.proto
DRILL-6271: Updated copyright range in NOTICE

closes #1188

DRILL-6290: Refactor TestInfoSchemaFilterPushDown tests to use PlanTestBase utility methods

closes #1186

DRILL-6224: Publish metrics gauge values correctly

The `metrics.ftl` page had gauges incorrectly set to near-zero values. The commit for metrics.ftl fixes that and also provides an estimate of the direct memory currently in active use (based on the `drill.allocator.root.used` value reported by the Drillbit).

closes #1160

DRILL-6288: Upgrade org.javassist:javassist and org.reflections:reflections

closes #1185

1.13 doc updates - cgroups memory updates

    • -0
    • +94
    /_docs/configure-drill/121-configuring-cgroups-to-control-cpu-usage.md
    • -94
    • +0
    /_docs/drill-on-yarn/094-appendix-e-using-cgroups-to-control-cpu-usage.md
DRILL-6287: apache-release profile should be disabled by default

closes #1182

DRILL-6283: WebServer stores SPNEGO client principal without taking any conversion rule

closes #1180

DRILL-6340 Output Batch Control in Project using the RecordBatchSizer

Changes required to implement Output Batch Sizing in Project using the RecordBatchSizer.

closes #1302

  1. … 42 more files in changeset.
DRILL-6254: IllegalArgumentException: the requested size must be non-negative

close apache/drill#1179

DRILL-6282: Update Drill's Metrics dependencies

- Replaced com.codahale.metrics with the latest io.dropwizard.metrics for Drill

- com.yammer.metrics is removed, since it isn't used directly by Drill

closes #1189

DRILL-6278: Removed temp codegen directory in testing framework.

close apache/drill#1178

DRILL-6280: Cleanup execution of BuildTimeScan during maven build

closes #1177

edit content and add syntax highlighting info

    • binary
    /_docs/img/ctas-1.png
    • binary
    /_docs/img/ctas-2.png
    • binary
    /_docs/img/query-1.png
    • binary
    /_docs/img/query-2.png
    • binary
    /_docs/img/storagep-1.png
    • binary
    /_docs/img/storagep-2.png
    • -2
    • +2
    /_docs/install/045-distributed-mode-prerequisites.md
    • -4
    • +4
    /_docs/install/050-starting-drill-in-distributed-mode.md
DRILL-6284: Add operator metrics for batch sizing for flatten

DRILL-6331: Revisit Hive Drill native parquet implementation to be exposed to Drill optimizations (filter / limit push down, count to direct scan)

1. Factored out common logic for Drill parquet reader and Hive Drill native parquet readers: AbstractParquetGroupScan, AbstractParquetRowGroupScan, AbstractParquetScanBatchCreator.

2. Rules that worked previously only with ParquetGroupScan, now can be applied for any class that extends AbstractParquetGroupScan: DrillFilterItemStarReWriterRule, ParquetPruneScanRule, PruneScanRule.

3. Hive populates partition values based on information returned from the Hive metastore. Drill populates partition values based on the path difference between the selection root and the actual file path.

Previously, ColumnExplorer populated partition values using the Drill approach. Since ColumnExplorer now also populates values for parquet files from Hive tables,

the `populateImplicitColumns` method logic was changed to populate partition columns based only on the given partition values.

4. Refactored ParquetPartitionDescriptor to be responsible for populating partition values rather than storing this logic in parquet group scan class.

5. Metadata class was moved to a separate metadata package (org.apache.drill.exec.store.parquet.metadata). Factored out several inner classes to improve code readability.

6. Collected all Drill native parquet reader unit tests into one class TestHiveDrillNativeParquetReader, also added new tests to cover new functionality.

7. Reduced excessive logging when parquet files metadata is read
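
The Drill approach mentioned in item 3 (partition values derived from the path difference between the selection root and the file path) can be sketched roughly as follows. This is an illustration under assumptions: the class and method names are hypothetical, paths are plain `/`-separated strings, and the real ColumnExplorer logic may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not the actual ColumnExplorer implementation.
final class PartitionPathSketch {

  // Returns the directory names that lie between the selection root and the file,
  // in order. The file name itself is not a partition value.
  static List<String> partitionValues(String selectionRoot, String filePath) {
    List<String> values = new ArrayList<>();
    String root = selectionRoot.endsWith("/") ? selectionRoot : selectionRoot + "/";
    if (!filePath.startsWith(root)) {
      return values; // file is not under the selection root
    }
    String[] parts = filePath.substring(root.length()).split("/");
    for (int i = 0; i < parts.length - 1; i++) { // skip the last element (file name)
      if (!parts[i].isEmpty()) {
        values.add(parts[i]);
      }
    }
    return values;
  }

  public static void main(String[] args) {
    System.out.println(partitionValues("/data/logs", "/data/logs/2018/03/part-0.parquet"));
  }
}
```

With the hypothetical paths in `main`, the two intermediate directories `2018` and `03` become the partition values; a file sitting directly in the selection root yields none.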

closes #1214

  1. … 50 more files in changeset.
DRILL-6248: Added limit push down support for system tables

1. PojoRecordReader now returns data in batches instead of returning everything in one batch. Default batch size is 4000.

2. SystemTableScan supports limit push down while extracting data in the record reader, based on the given maximum number of records to read.

3. Profiles and profiles_json tables apply limit push down while extracting data from store accessing profiles by range.
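
The combination of batching and limit push down described in items 1 and 2 could look roughly like this. It is a sketch, not the PojoRecordReader code: the class name, constructor, and `nextBatch` method are hypothetical, and only the default batch size of 4000 comes from the commit message.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration of batched reads with a pushed-down record limit.
final class BatchedReaderSketch<T> {
  static final int DEFAULT_BATCH_SIZE = 4000; // value stated in the commit message

  private final Iterator<T> source;
  private final int batchSize;
  private long remaining; // records still allowed by the pushed-down limit

  BatchedReaderSketch(Iterator<T> source, int batchSize, long maxRecords) {
    this.source = source;
    this.batchSize = batchSize;
    this.remaining = maxRecords;
  }

  // Returns up to batchSize records; an empty list signals the end of data
  // or that the limit has been reached.
  List<T> nextBatch() {
    List<T> batch = new ArrayList<>();
    while (batch.size() < batchSize && remaining > 0 && source.hasNext()) {
      batch.add(source.next());
      remaining--;
    }
    return batch;
  }

  public static void main(String[] args) {
    BatchedReaderSketch<String> reader =
        new BatchedReaderSketch<>(Arrays.asList("a", "b", "c", "d", "e").iterator(), 2, 3);
    List<String> batch;
    while (!(batch = reader.nextBatch()).isEmpty()) {
      System.out.println(batch);
    }
  }
}
```

Because the limit is enforced inside the reader, the source iterator is never advanced past the requested number of records.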

closes #1183

edits

    • -2
    • +2
    /_docs/install/050-starting-drill-in-distributed-mode.md
updated data file

add cgroup doc for DoY

    • -0
    • +94
    /_docs/drill-on-yarn/094-appendix-e-using-cgroups-to-control-cpu-usage.md
    • -5
    • +5
    /_docs/install/050-starting-drill-in-distributed-mode.md
DRILL-4286 doc for graceful shutdown feature

    • -6
    • +47
    /_docs/install/050-starting-drill-in-distributed-mode.md
DRILL-6275: Fixed direct memory reporting in sys.memory.

closes #1176

DRILL-6323: Lateral Join - Lateral Join Batch Memory manager support using the record batch sizer

apache drill 1.13 updates

Doc and website updates for the 1.13 release

    • -0
    • +4
    /_docs/031-drill-on-yarn.md
    • -0
    • +47
    /_docs/drill-on-yarn/010-drill-on-yarn-introduction.md
    • -0
    • +237
    /_docs/drill-on-yarn/020-creating-a-basic-drill-cluster.md
    • -0
    • +30
    /_docs/drill-on-yarn/030-launch-drill-under-yarn.md
    • -0
    • +84
    /_docs/drill-on-yarn/040-configuration-reference.md
    • -0
    • +104
    /_docs/drill-on-yarn/050-drill-on-yarn-command-line-tool.md
    • -0
    • +83
    /_docs/drill-on-yarn/060-using-the-drill-on-yarn-web-ui.md
    • -0
    • +70
    /_docs/drill-on-yarn/070-multiple-drill-clusters.md
    • -0
    • +32
    /_docs/drill-on-yarn/080-enabling-web-ui-security.md
    • -0
    • +173
    /_docs/drill-on-yarn/090-appendix-a-release-note-issues.md
    • -0
    • +18
    /_docs/drill-on-yarn/091-appendix-b-drill-env.sh-settings.md
    • -0
    • +96
    /_docs/drill-on-yarn/092-appendix-c-troubleshooting.md
    • -0
    • +31
    /_docs/drill-on-yarn/093-appendix-d-recreate-the-drill-archive.md
  1. … 8 more files in changeset.
DRILL-6250: Sqlline start command with password appears in the sqlline.log

closes #1174

DRILL-6262: IndexOutOfBoundException in RecordBatchSize for empty variableWidthVector

closes #1175

DRILL-6381: (Part 3) Planner and Execution implementation to support Secondary Indexes

  1. Index Planning Rules and Plan generators

    - DbScanToIndexScanRule: Top level physical planning rule that drives index planning for several relational algebra patterns.

    - DbScanSortRemovalRule: Physical planning rule for index planning for Sort-based operations.

    - Plan Generators: Covering, Non-Covering and Intersect physical plan generators.

    - Support planning with functional indexes such as CAST functions.

    - Enhance PlannerSettings with several configuration options for indexes.

  2. Index Selection and Statistics

    - An IndexSelector that supports cost-based index selection of covering and non-covering indexes using statistics and collation properties.

    - Costing of index intersection for comparison with single-index plans.

  3. Planning and execution operators

    - Support RangePartitioning physical operator during query planning and execution.

    - Support RowKeyJoin physical operator during query planning and execution.

    - HashTable and HashJoin changes to support RowKeyJoin and Index Intersection.

    - Enhance Materializer to keep track of subscan association with a particular rowkey join.

  4. Index Planning utilities

    - Utility classes to perform RexNode analysis, including conversion to and from SchemaPath.

    - Utility class to analyze filter condition and an input collation to determine output collation.

    - Helper classes to maintain index contexts for logical and physical planning phase.

    - IndexPlanUtils utility class for various helper methods.

  5. Miscellaneous

    - Separate physical rel for DirectScan.

    - Modify LimitExchangeTranspose rule to handle SingleMergeExchange.

    - MD-3880: Return correct status from RangePartitionRecordBatch setupNewSchema

Co-authored-by: Aman Sinha <asinha@maprtech.com>

Co-authored-by: chunhui-shi <cshi@maprtech.com>

Co-authored-by: Gautam Parai <gparai@maprtech.com>

Co-authored-by: Padma Penumarthy <ppenumar97@yahoo.com>

Co-authored-by: Hanumath Rao Maduri <hmaduri@maprtech.com>

Conflicts:

exec/java-exec/src/main/java/org/apache/drill/exec/physical/config/HashJoinPOP.java

exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/ScanBatch.java

exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/common/HashPartition.java

exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/common/HashTable.java

exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/common/HashTableTemplate.java

exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/HashJoinBatch.java

exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillRelOptUtil.java

exec/java-exec/src/main/java/org/apache/drill/exec/planner/fragment/Materializer.java

exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillMergeProjectRule.java

exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillOptiq.java

exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillPushProjectIntoScanRule.java

exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillScanRel.java

exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/BroadcastExchangePrel.java

exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/DrillDistributionTrait.java

exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/HashJoinPrel.java

exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/PrelUtil.java

exec/java-exec/src/main/java/org/apache/drill/exec/server/options/SystemOptionManager.java

exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetPushDownFilter.java

exec/java-exec/src/main/resources/drill-module.conf

logical/src/main/java/org/apache/drill/common/logical/StoragePluginConfig.java

Resolve merge conflicts and compilation issues.

  1. … 79 more files in changeset.
DRILL-6381: (Part 4) Enhance MapR-DB plugin to support querying secondary indexes

1. Implementation of the index descriptor for MapR-DB.

2. MapR-DB specific costing for covering and non-covering indexes.

3. Discovery component to discover the indexes available for a MapR-DB table, including CAST functional indexes.

4. Utility functions to build a canonical index descriptor.

5. Statistics: fetch and initialize statistics from MapR-DB for a query condition. Maintain a query-scoped cache for the statistics. Utility functions to compute selectivity.

6. Range Partitioning: partitioning function that takes into account the tablet map to find out where a particular rowkey belongs.

7. Restricted Scan: support doing a restricted (i.e., skip) scan through lookups on the rowkey. Added a group-scan and record reader for this.

8. MD-3726: Simple Order by queries (without limit) when an index is used are showing regression.

9. MD-3995: Do not pushdown limit 0 past project with CONVERT_FROMJSON

10. MD-4259 : Account for limit during hashcode computation
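
The range partitioning idea in item 6 (using the tablet map to find where a rowkey belongs) can be illustrated with a sorted map of tablet start keys. This is only a sketch under assumptions: the class and its methods are hypothetical, rowkeys are modeled as strings, and the MapR-DB plugin's actual partitioning function is not shown in this log.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration; not the MapR-DB plugin's partitioning function.
final class TabletRangePartitionerSketch {
  // Maps each tablet's inclusive start key to the partition that handles it.
  private final TreeMap<String, Integer> tabletStartKeys = new TreeMap<>();

  void addTablet(String startKey, int partitionId) {
    tabletStartKeys.put(startKey, partitionId);
  }

  // Finds the tablet whose start key is the greatest key <= rowKey.
  int partitionFor(String rowKey) {
    Map.Entry<String, Integer> tablet = tabletStartKeys.floorEntry(rowKey);
    if (tablet == null) {
      throw new IllegalArgumentException("No tablet covers rowkey: " + rowKey);
    }
    return tablet.getValue();
  }

  public static void main(String[] args) {
    TabletRangePartitionerSketch partitioner = new TabletRangePartitionerSketch();
    partitioner.addTablet("", 0);  // first tablet starts at the empty key
    partitioner.addTablet("m", 1);
    partitioner.addTablet("t", 2);
    System.out.println(partitioner.partitionFor("monkey"));
  }
}
```

A sorted map keeps the lookup logarithmic in the number of tablets, which is one plausible reason to organize the tablet map by start key.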

Co-authored-by: Aman Sinha <asinha@maprtech.com>

Co-authored-by: chunhui-shi <cshi@maprtech.com>

Co-authored-by: Gautam Parai <gparai@maprtech.com>

Co-authored-by: Padma Penumarthy <ppenumar97@yahoo.com>

Co-authored-by: Hanumath Rao Maduri <hmaduri@maprtech.com>

Conflicts:

contrib/format-maprdb/src/main/java/org/apache/drill/exec/store/mapr/db/MapRDBFormatMatcher.java

contrib/format-maprdb/src/main/java/org/apache/drill/exec/store/mapr/db/MapRDBPushProjectIntoScan.java

contrib/format-maprdb/src/main/java/org/apache/drill/exec/store/mapr/db/json/JsonTableGroupScan.java

exec/java-exec/src/main/java/org/apache/drill/exec/planner/index/rules/DbScanSortRemovalRule.java

exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/SortPrel.java

exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/TopNPrel.java

Fix additional compilation issues.

  1. … 16 more files in changeset.
DRILL-6231: Fix memory allocation for repeated list vector

closes #1171