drill

DRILL-5603: Replace String file paths to Hadoop Path

- replaced all String path representation with org.apache.hadoop.fs.Path

- added PathSerDe.Se JSON serializer

- refactoring of DFSPartitionLocation code by leveraging existing listPartitionValues() functionality

closes #1657

  1. … 69 more files in changeset.
DRILL-6582: SYSLOG (RFC-5424) Format Plugin

closes #1530

    • -0
    • +41
    /contrib/format-syslog/README.md
    • -0
    • +89
    /contrib/format-syslog/pom.xml
    • -0
    • +8
    /contrib/format-syslog/src/test/resources/syslog/logs.syslog
    • -0
    • +8
    /contrib/format-syslog/src/test/resources/syslog/logs.syslog1
    • -0
    • +6
    /contrib/format-syslog/src/test/resources/syslog/logs1.syslog
    • -0
    • +1
    /contrib/format-syslog/src/test/resources/syslog/test.syslog
    • -0
    • +2
    /contrib/format-syslog/src/test/resources/syslog/test.syslog1
  1. … 6 more files in changeset.
DRILL-6734: JDBC storage plugin returns null for fields without aliases

- Add output column names to JdbcRecordReader and use them for storing the results since column names in result set may differ when aliases aren't specified

closes #1642

DRILL-4858: REPEATED_COUNT on an array of maps and an array of arrays is not implemented

- Implemented 'repeated_count' function for repeated MAP and repeated LIST;

- Updated RepeatedListReader and RepeatedMapReader implementations to return correct value from size() method

- Moved repeated_count to freemarker template and added support for more repeated types for the function

closes #1641

DRILL-7117: Support creation of equi-depth histogram for selected data types.

Support int/bigint/float4/float8, time/timestamp/date and boolean.

Build the histogram from the t-digest byte array and serialize as JSON string.

More changes for serialization/deserialization.

Add code-gen stubs (empty) for VarChar/VarBinary types.

Address review comments (part 1). Add unit test.

Address review comments (part 2) for sampling.

close apache/drill#1715
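
To illustrate the equi-depth idea behind DRILL-7117: every bucket should hold roughly the same number of rows, so the bucket boundaries are quantiles of the column. The sketch below is only an illustration, not Drill's implementation. It computes exact quantiles from a sorted array, whereas the commit derives them from a t-digest, and the names EquiDepthSketch and boundaries are hypothetical.

```java
import java.util.Arrays;

class EquiDepthSketch {
    // Returns numBuckets + 1 boundaries so that each bucket holds roughly
    // the same number of values. Exact quantiles over sorted data stand in
    // for the t-digest approximation mentioned in the commit message.
    public static double[] boundaries(double[] values, int numBuckets) {
        if (values.length == 0 || numBuckets < 1) {
            throw new IllegalArgumentException("need data and at least one bucket");
        }
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        double[] result = new double[numBuckets + 1];
        for (int i = 0; i <= numBuckets; i++) {
            // Position of the i-th quantile, clamped to the last element.
            long pos = (long) i * sorted.length / numBuckets;
            int idx = (int) Math.min(sorted.length - 1, pos);
            result[i] = sorted[idx];
        }
        return result;
    }

    public static void main(String[] args) {
        double[] data = {1, 2, 3, 4, 5, 6, 7, 8, 100};
        // prints [1.0, 4.0, 7.0, 100.0]
        System.out.println(Arrays.toString(boundaries(data, 3)));
    }
}
```

In this sample the outlier 100 does not widen the first two buckets: each of the three ranges still covers three values, which is the property that makes equi-depth histograms useful for selectivity estimates on skewed data.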

change httpd storage plugin to format plugin

edit httpd page - change to format plugin vs storage

DRILL-7058: Refresh command to support subset of columns

closes #1666

DRILL-7038: Queries on partitioned columns scan the entire datasets

- Added a new optimizer rule that checks whether the query references only directory columns and has a DISTINCT or GROUP BY operation. If so, instead of scanning the full file set, the following is performed:

1) if there is a metadata cache file, the directories are read from it,

2) otherwise the directories are gathered from the selection object (PartitionLocation).

Finally, the Scan node is transformed into a DrillValuesRel (containing constant literals) holding the gathered values, so no scan is performed.

closes #1640
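
The idea in DRILL-7038 can be sketched roughly as follows: when a query needs only directory columns, their distinct values can be read off the file paths without opening any file. This is a simplified illustration under assumptions, not the optimizer rule itself. The class DirectoryColumnSketch, the method distinctDirValues, and the sample paths are hypothetical, and plain relative path strings stand in for the metadata cache and PartitionLocation objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class DirectoryColumnSketch {
    // Collects the distinct values of the directory column at the given depth
    // (0 for the first directory level, 1 for the second, ...) from file paths
    // relative to the table root. Only the path strings are inspected.
    public static List<String> distinctDirValues(List<String> relativeFilePaths, int depth) {
        Set<String> seen = new LinkedHashSet<>();
        for (String path : relativeFilePaths) {
            String[] parts = path.split("/");
            // The last part is the file name, so directories are parts[0..length-2].
            // Files that are not nested deeply enough are skipped in this sketch.
            if (depth < parts.length - 1) {
                seen.add(parts[depth]);
            }
        }
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        List<String> files = Arrays.asList(
            "1994/Q1/orders.parquet",
            "1994/Q2/orders.parquet",
            "1995/Q1/orders.parquet");
        System.out.println(distinctDirValues(files, 0)); // prints [1994, 1995]
        System.out.println(distinctDirValues(files, 1)); // prints [Q1, Q2]
    }
}
```

The returned list plays the role of the constant literals that replace the scan: the query is answered from path information alone.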

DRILL-7022: Partition pruning is not happening the first time after the metadata auto-refresh

closes #1638

DRILL-7046: Support for loading and parsing new RM config file

closes #1652

  1. … 49 more files in changeset.
edit cgroup doc

DRILL-7031: Add Travis job that runs protobuf generation command and checks if all protobufs are up-to-date

closes #1636

DRILL-6989: Upgrade to SqlLine 1.7

closes #1717

    • -1
    • +1
    /distribution/src/resources/sqlline.bat
edits to cgroup doc

    • -35
    • +37
    /_docs/configure-drill/121-configuring-cgroups-to-control-cpu-usage.md
DRILL-6780: Caching dependencies for CircleCI

closes #1632

edits

DRILL-7018: Fixed Parquet buffer overflow when reading timestamp column

close apache/drill#1630

DRILL-7019: Add check for redundant imports

close apache/drill#1629

  1. … 9 more files in changeset.
DRILL-7016: Wrong query result with RuntimeFilter enabled when order of join and filter condition is swapped

close apache/drill#1628

DRILL-7024: Refactor ColumnWriter to simplify type-conversion shim

DRILL-7006 added a type conversion "shim" within the row set framework. Basically, we insert a "shim" column writer that takes data in one form (String, say), and does reader-specific conversions to a target format (INT, say).

The code works fine, but the shim class ends up needing to override a bunch of methods which it then passes along to the base writer. This PR refactors the code so that the conversion shim is simpler.

closes #1633

  1. … 52 more files in changeset.
DRILL-7008: Drillbits: clear stale shutdown hooks

ShutdownThread is no longer required when Drillbit#close() is called.

mvn install for Drill project consumed 600MiB (there were 160 shutdown hooks)

close apache/drill#1625
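
The pattern described in DRILL-7008 can be sketched with the standard java.lang.Runtime API: a component registers a shutdown hook when it starts and unregisters it when it is closed explicitly, so hooks do not pile up across many start/close cycles in one JVM. This is a minimal sketch assuming a hypothetical ShutdownHookSketch class. It is not the Drillbit code, and the hook body only prints a message.

```java
class ShutdownHookSketch implements AutoCloseable {
    private final Thread shutdownHook;
    private boolean hookRemoved;

    ShutdownHookSketch() {
        // Register a hook that would run if the JVM exits before close() is called.
        shutdownHook = new Thread(() -> System.out.println("closing on JVM exit"));
        Runtime.getRuntime().addShutdownHook(shutdownHook);
    }

    @Override
    public void close() {
        // After an explicit close the hook has nothing left to do, so it is
        // unregistered. Otherwise the Runtime keeps a reference to the thread,
        // and to whatever it captures, until the JVM exits.
        hookRemoved = Runtime.getRuntime().removeShutdownHook(shutdownHook);
    }

    public boolean isHookRemoved() {
        return hookRemoved;
    }

    public static void main(String[] args) {
        ShutdownHookSketch service = new ShutdownHookSketch();
        service.close();
        System.out.println(service.isHookRemoved()); // prints true
    }
}
```

Runtime.removeShutdownHook throws IllegalStateException once shutdown is in progress, so a real hook that calls close() itself would need to guard against that case.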

DRILL-7007: Use verify method in row set tests

Many of the early RowSet-based tests used the pattern:

new RowSetComparison(expected)
    .verifyAndClearAll(result);

Revise this to use the simplified form:

RowSetUtilities.verify(expected, result);

The original form is retained when tests use additional functionality, such as the ability to perform multiple verifications on the same expected batch.

closes #1624

DRILL-7006: Add type conversion to row writers

Modifies the column metadata and writer abstractions to allow a type conversion "shim" to be specified as part of the schema, then inserted as part of the row set writer. Allows, say, setting an Int or Date from a string, parsing the string to obtain the proper data type to store in the vector.

Type conversion not yet supported in the result set loader: some additional complexity needs to be resolved.

Adds unit tests for this functionality. Refactors some existing tests to remove rough edges.

closes #1623
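
The "shim" idea from DRILL-7006 can be illustrated as follows: a thin writer accepts data in one form (a string), converts it, and forwards the result to the base writer for the target type. All names here (ConversionShimSketch, IntWriter, StringToIntShim) are hypothetical and do not correspond to Drill's column writer classes. A plain list stands in for the value vector.

```java
import java.util.ArrayList;
import java.util.List;

class ConversionShimSketch {
    // Hypothetical stand-in for a column writer that stores ints in a vector.
    interface IntWriter {
        void setInt(int value);
    }

    // Shim that accepts strings, parses them, and forwards to the base writer.
    static class StringToIntShim {
        private final IntWriter base;

        StringToIntShim(IntWriter base) {
            this.base = base;
        }

        void setString(String value) {
            // Conversion step: parse the text, then delegate the actual write.
            base.setInt(Integer.parseInt(value.trim()));
        }
    }

    public static void main(String[] args) {
        List<Integer> vector = new ArrayList<>(); // stands in for a value vector
        StringToIntShim shim = new StringToIntShim(v -> vector.add(v));
        shim.setString("42");
        shim.setString(" 7 ");
        System.out.println(vector); // prints [42, 7]
    }
}
```

Keeping the conversion in a separate wrapper means the base writer stays unaware of the source format, which matches the motivation given in the commit message.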

DRILL-7002: Output correct results regardless of the exec.hashjoin.num_partitions setting

close apache/drill#1622

doc updates

DRILL-6999: Fix the case where there is more than one join condition

closes #1600

DRILL-7000: Queries failing with 'Failed to aggregate or route the RFW' do not complete

closes #1621

DRILL-6910: Allow applying DrillPushProjectIntoScanRule at physical phase

closes #1619

DRILL-6950: Row set-based scan framework

Adds the "plumbing" that connects the scan operator to the result set loader and the scan projection framework. See the various package-info.java files for the technical details. Also adds a large number of tests.

This PR does not yet introduce an actual scan operator: that will follow in subsequent PRs.

closes #1618

  1. … 47 more files in changeset.