drill

DRILL-6278: Removed temp codegen directory in testing framework.

close apache/drill#1178

DRILL-6280: Cleanup execution of BuildTimeScan during maven build

closes #1177

edit content and add syntax highlighting info

    • binary
    /_docs/img/ctas-1.png
    • binary
    /_docs/img/ctas-2.png
    • binary
    /_docs/img/query-1.png
    • binary
    /_docs/img/query-2.png
    • binary
    /_docs/img/storagep-1.png
    • binary
    /_docs/img/storagep-2.png
    • -2
    • +2
    /_docs/install/045-distributed-mode-prerequisites.md
    • -4
    • +4
    /_docs/install/050-starting-drill-in-distributed-mode.md
DRILL-6284: Add operator metrics for batch sizing for flatten

DRILL-6331: Revisit Hive Drill native parquet implementation to be exposed to Drill optimizations (filter / limit push down, count to direct scan)

1. Factored out common logic for Drill parquet reader and Hive Drill native parquet readers: AbstractParquetGroupScan, AbstractParquetRowGroupScan, AbstractParquetScanBatchCreator.

2. Rules that previously worked only with ParquetGroupScan can now be applied to any class that extends AbstractParquetGroupScan: DrillFilterItemStarReWriterRule, ParquetPruneScanRule, PruneScanRule.

3. Hive populates partition values based on information returned from the Hive metastore. Drill populates partition values based on the path difference between the selection root and the actual file path.

Previously, ColumnExplorer populated partition values using the Drill approach. Since ColumnExplorer now also populates values for parquet files from Hive tables, the `populateImplicitColumns` method logic was changed to populate partition columns based only on the given partition values.

4. Refactored ParquetPartitionDescriptor to be responsible for populating partition values rather than storing this logic in parquet group scan class.

5. Metadata class was moved to a separate metadata package (org.apache.drill.exec.store.parquet.metadata). Factored out several inner classes to improve code readability.

6. Collected all Drill native parquet reader unit tests into one class TestHiveDrillNativeParquetReader, also added new tests to cover new functionality.

7. Reduced excessive logging when parquet files metadata is read

closes #1214

  1. … 50 more files in changeset.
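
Item 3 of DRILL-6331 describes Drill deriving partition values from the path difference between the selection root and the file path. A minimal sketch of that idea follows. `PathPartitionSketch` and `partitionValues` are hypothetical names for illustration only; this is not the code in ColumnExplorer or ParquetPartitionDescriptor, and it uses `java.nio.file.Path` rather than Hadoop paths for simplicity.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of Drill.
final class PathPartitionSketch {

  // Returns the directory names that lie between the selection root and the
  // file itself, in order. In Drill these would surface as dir0, dir1, ...
  static List<String> partitionValues(String selectionRoot, String filePath) {
    Path root = Paths.get(selectionRoot).normalize();
    Path file = Paths.get(filePath).normalize();
    if (!file.startsWith(root)) {
      throw new IllegalArgumentException(filePath + " is not under " + selectionRoot);
    }
    Path relative = root.relativize(file);
    List<String> values = new ArrayList<>();
    // Every name element except the last one (the file name) is a partition directory.
    for (int i = 0; i < relative.getNameCount() - 1; i++) {
      values.add(relative.getName(i).toString());
    }
    return values;
  }

  public static void main(String[] args) {
    System.out.println(partitionValues("/data/logs", "/data/logs/2018/Q1/part-0.parquet"));
  }
}
```

A file that sits directly in the selection root yields an empty list, which matches the case where no partition directories exist.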
DRILL-6248: Added limit push down support for system tables

1. PojoRecordReader now returns data in batches instead of returning everything in one batch. The default batch size is 4000.

2. SystemTableScan supports limit push down while extracting data in the record reader, based on the given maximum number of records to read.

3. Profiles and profiles_json tables apply limit push down while extracting data from the store by accessing profiles by range.

closes #1183
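
The batching and limit push down described in DRILL-6248 can be sketched roughly as below. This is an assumption-laden illustration, not the actual PojoRecordReader: `BatchedReaderSketch`, `nextBatch` and `maxRecordsToRead` are hypothetical names, and only the default batch size of 4000 comes from the commit message.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical reader for illustration; not Drill's PojoRecordReader.
final class BatchedReaderSketch<T> {

  // Default batch size stated in the commit message.
  static final int DEFAULT_BATCH_SIZE = 4000;

  private final Iterator<T> source;
  private final int batchSize;
  // Records still allowed by the pushed-down limit.
  private int remaining;

  BatchedReaderSketch(Iterator<T> source, int batchSize, int maxRecordsToRead) {
    if (batchSize <= 0) {
      throw new IllegalArgumentException("batchSize must be positive");
    }
    this.source = source;
    this.batchSize = batchSize;
    this.remaining = Math.max(0, maxRecordsToRead);
  }

  // Returns the next batch. An empty list means the source is exhausted
  // or the limit has been reached.
  List<T> nextBatch() {
    List<T> batch = new ArrayList<>();
    while (batch.size() < batchSize && remaining > 0 && source.hasNext()) {
      batch.add(source.next());
      remaining--;
    }
    return batch;
  }
}
```

Because the limit is checked inside the read loop, the reader stops pulling records from the source as soon as the limit is satisfied instead of materializing everything and trimming afterwards.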

edits

    • -2
    • +2
    /_docs/install/050-starting-drill-in-distributed-mode.md
updated data file

add cgroup doc for DoY

    • -0
    • +94
    /_docs/drill-on-yarn/094-appendix-e-using-cgroups-to-control-cpu-usage.md
    • -5
    • +5
    /_docs/install/050-starting-drill-in-distributed-mode.md
DRILL-4286 doc for graceful shutdown feature

    • -6
    • +47
    /_docs/install/050-starting-drill-in-distributed-mode.md
DRILL-6275: Fixed direct memory reporting in sys.memory.

closes #1176

DRILL-6323: Lateral Join - Lateral Join Batch Memory manager support using the record batch sizer

apache drill 1.13 updates

Doc and website updates for the 1.13 release

    • -0
    • +4
    /_docs/031-drill-on-yarn.md
    • -0
    • +47
    /_docs/drill-on-yarn/010-drill-on-yarn-introduction.md
    • -0
    • +237
    /_docs/drill-on-yarn/020-creating-a-basic-drill-cluster.md
    • -0
    • +30
    /_docs/drill-on-yarn/030-launch-drill-under-yarn.md
    • -0
    • +84
    /_docs/drill-on-yarn/040-configuration-reference.md
    • -0
    • +104
    /_docs/drill-on-yarn/050-drill-on-yarn-command-line-tool.md
    • -0
    • +83
    /_docs/drill-on-yarn/060-using-the-drill-on-yarn-web-ui.md
    • -0
    • +70
    /_docs/drill-on-yarn/070-multiple-drill-clusters.md
    • -0
    • +32
    /_docs/drill-on-yarn/080-enabling-web-ui-security.md
    • -0
    • +173
    /_docs/drill-on-yarn/090-appendix-a-release-note-issues.md
    • -0
    • +18
    /_docs/drill-on-yarn/091-appendix-b-drill-env.sh-settings.md
    • -0
    • +96
    /_docs/drill-on-yarn/092-appendix-c-troubleshooting.md
    • -0
    • +31
    /_docs/drill-on-yarn/093-appendix-d-recreate-the-drill-archive.md
  1. … 8 more files in changeset.
DRILL-6250: Sqlline start command with password appears in the sqlline.log

closes #1174

DRILL-6262: IndexOutOfBoundException in RecordBatchSize for empty variableWidthVector

closes #1175

DRILL-6381: (Part 3) Planner and Execution implementation to support Secondary Indexes

  1. Index Planning Rules and Plan generators

    - DbScanToIndexScanRule: Top level physical planning rule that drives index planning for several relational algebra patterns.

    - DbScanSortRemovalRule: Physical planning rule for index planning for Sort-based operations.

    - Plan Generators: Covering, Non-Covering and Intersect physical plan generators.

    - Support planning with functional indexes such as CAST functions.

    - Enhance PlannerSettings with several configuration options for indexes.

  2. Index Selection and Statistics

    - An IndexSelector that supports cost-based index selection of covering and non-covering indexes using statistics and collation properties.

    - Costing of index intersection for comparison with single-index plans.

  3. Planning and execution operators

    - Support RangePartitioning physical operator during query planning and execution.

    - Support RowKeyJoin physical operator during query planning and execution.

    - HashTable and HashJoin changes to support RowKeyJoin and Index Intersection.

    - Enhance Materializer to keep track of subscan association with a particular rowkey join.

  4. Index Planning utilities

    - Utility classes to perform RexNode analysis, including conversion to and from SchemaPath.

    - Utility class to analyze filter condition and an input collation to determine output collation.

    - Helper classes to maintain index contexts for logical and physical planning phase.

    - IndexPlanUtils utility class for various helper methods.

  5. Miscellaneous

    - Separate physical rel for DirectScan.

    - Modify LimitExchangeTranspose rule to handle SingleMergeExchange.

    - MD-3880: Return correct status from RangePartitionRecordBatch setupNewSchema

Co-authored-by: Aman Sinha <asinha@maprtech.com>

Co-authored-by: chunhui-shi <cshi@maprtech.com>

Co-authored-by: Gautam Parai <gparai@maprtech.com>

Co-authored-by: Padma Penumarthy <ppenumar97@yahoo.com>

Co-authored-by: Hanumath Rao Maduri <hmaduri@maprtech.com>

Conflicts:

exec/java-exec/src/main/java/org/apache/drill/exec/physical/config/HashJoinPOP.java

exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/ScanBatch.java

exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/common/HashPartition.java

exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/common/HashTable.java

exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/common/HashTableTemplate.java

exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/HashJoinBatch.java

exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillRelOptUtil.java

exec/java-exec/src/main/java/org/apache/drill/exec/planner/fragment/Materializer.java

exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillMergeProjectRule.java

exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillOptiq.java

exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillPushProjectIntoScanRule.java

exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillScanRel.java

exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/BroadcastExchangePrel.java

exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/DrillDistributionTrait.java

exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/HashJoinPrel.java

exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/PrelUtil.java

exec/java-exec/src/main/java/org/apache/drill/exec/server/options/SystemOptionManager.java

exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetPushDownFilter.java

exec/java-exec/src/main/resources/drill-module.conf

logical/src/main/java/org/apache/drill/common/logical/StoragePluginConfig.java

Resolve merge conflicts and compilation issues.

  1. … 79 more files in changeset.
DRILL-6381: (Part 4) Enhance MapR-DB plugin to support querying secondary indexes

1. Implementation of the index descriptor for MapR-DB.

2. MapR-DB specific costing for covering and non-covering indexes.

3. Discovery component to discover the indexes available for a MapR-DB table including CAST functional indexes.

4. Utility functions to build a canonical index descriptor.

5. Statistics: fetch and initialize statistics from MapR-DB for a query condition. Maintain a query-scoped cache for the statistics. Utility functions to compute selectivity.

6. Range Partitioning: partitioning function that takes into account the tablet map to find out where a particular rowkey belongs.

7. Restricted Scan: support doing a restricted (i.e., skip) scan through lookups on the rowkey. Added a group-scan and record reader for this.

8. MD-3726: Simple Order by queries (without limit) when an index is used are showing regression.

9. MD-3995: Do not pushdown limit 0 past project with CONVERT_FROMJSON

10. MD-4259: Account for limit during hashcode computation

Co-authored-by: Aman Sinha <asinha@maprtech.com>

Co-authored-by: chunhui-shi <cshi@maprtech.com>

Co-authored-by: Gautam Parai <gparai@maprtech.com>

Co-authored-by: Padma Penumarthy <ppenumar97@yahoo.com>

Co-authored-by: Hanumath Rao Maduri <hmaduri@maprtech.com>

Conflicts:

contrib/format-maprdb/src/main/java/org/apache/drill/exec/store/mapr/db/MapRDBFormatMatcher.java

contrib/format-maprdb/src/main/java/org/apache/drill/exec/store/mapr/db/MapRDBPushProjectIntoScan.java

contrib/format-maprdb/src/main/java/org/apache/drill/exec/store/mapr/db/json/JsonTableGroupScan.java

exec/java-exec/src/main/java/org/apache/drill/exec/planner/index/rules/DbScanSortRemovalRule.java

exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/SortPrel.java

exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/TopNPrel.java

Fix additional compilation issues.

  1. … 16 more files in changeset.
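
Item 6 of DRILL-6381 (Part 4) mentions a partitioning function that consults the tablet map to find where a rowkey belongs. One common way to do such a lookup is a sorted map keyed by each tablet's start key, sketched below. Everything here is an assumption for illustration: `TabletRangeSketch`, `addTablet` and `partitionFor` are hypothetical names, rowkeys are modeled as `String` rather than bytes, and the real MapR-DB plugin may work differently.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical lookup structure for illustration; not the MapR-DB plugin's code.
final class TabletRangeSketch {

  // Maps each tablet's inclusive start rowkey to a partition id.
  // The first tablet is assumed to start at the empty string.
  private final TreeMap<String, Integer> tabletStarts = new TreeMap<>();

  void addTablet(String startKey, int partitionId) {
    tabletStarts.put(startKey, partitionId);
  }

  // Finds the tablet whose start key is the greatest key <= rowKey.
  int partitionFor(String rowKey) {
    Map.Entry<String, Integer> entry = tabletStarts.floorEntry(rowKey);
    if (entry == null) {
      throw new IllegalStateException("no tablet covers rowkey " + rowKey);
    }
    return entry.getValue();
  }
}
```

With tablets starting at `""`, `"g"` and `"p"`, a rowkey such as `"grape"` falls into the second tablet because `"g"` is the greatest start key that does not exceed it.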
DRILL-6231: Fix memory allocation for repeated list vector

closes #1171

DRILL-6249: Adding more unit testing documentation.

close apache/drill#1251

    • -0
    • +4
    /docs/dev/BaseTestQuery.md
    • -0
    • +4
    /docs/dev/ClusterTest.md
    • -0
    • +51
    /docs/dev/GeneratedCode.md
    • -0
    • +107
    /docs/dev/InstantiatingComponents.md
    • -0
    • +34
    /docs/dev/LicenseHeaders.md
    • -0
    • +31
    /docs/dev/PhysicalOpUnitTestBase.md
    • -0
    • +112
    /docs/dev/RowSetFramework.md
    • -0
    • +17
    /docs/dev/TempDirectories.md
    • -0
    • +159
    /docs/dev/TestDataSets.md
  1. … 15 more files in changeset.
test

DRILL-6323: Lateral Join - Refactor BatchMemorySize to put outputBatchSize in abstract class. Created a new JoinBatchMemoryManager to be shared across join record batches. Changed merge join to use AbstractBinaryRecordBatch instead of AbstractRecordBatch, and use JoinBatchMemoryManager

DRILL-6243: Added alert box to confirm shutdown of drillbit after clicking shutdown button

closes #1169

Update version to 1.14.0-SNAPSHOT

    • -1
    • +1
    /contrib/data/tpch-sample-data/pom.xml
  1. … 16 more files in changeset.
DRILL-6241: Saffron properties config has the excessive permissions

Changed saffron.properties permissions to 640.

closes #1167

DRILL-6016: Fix for Error reading INT96 created by Apache Spark

closes #1166

fix release from 1.13 to 1.14 for hash join spill support

doc updates for Drill 1.13

DRILL-6239: Add build and license badges to README.md

closes #1165

DRILL-6234: Improved documentation for VariableWidthVector mutators, and added simple unit tests demonstrating mutator behavior.

close apache/drill#1164