DRILL-7326: Support repeated lists for CTAS parquet format

closes #1844

    • -0
    • +2
    ./jsoninput/repeated_list_of_maps.json
  1. … 4 more files in changeset.
DRILL-4517: Support reading empty Parquet files

1. Modified flat and complex parquet readers to output schema only when requested number of records to read is 0. In this case readers are not initialized to improve performance.

2. Allowed reading requested number of rows instead of all rows in the row group (DRILL-6528).

3. Fixed determination of the number of nulls in the row group (fixed IsPredicate#isAllNulls method).

4. Allowed reading empty parquet files via adding empty / fake row group.

5. General refactoring and unit tests.

6. Parquet tests categorization.

closes #1839

    • binary
    ./parquet/empty/complex/empty_complex.parquet
    • binary
    ./parquet/empty/complex/non_empty_complex.parquet
    • binary
    ./parquet/empty/simple/empty_simple.parquet
    • binary
    ./parquet/empty/simple/non_empty_simple.parquet
  1. … 45 more files in changeset.
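The reader behavior described in DRILL-4517 (schema-only output when zero records are requested, reading only the requested number of rows, and an empty row group standing in for an empty file) can be illustrated with a minimal Python sketch. The `RowGroup` class and `read` function are hypothetical names for illustration; they are not Drill's Java reader classes.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class RowGroup:
    """Hypothetical stand-in for a Parquet row group: a schema plus rows."""
    schema: Dict[str, str]
    rows: List[dict] = field(default_factory=list)


def read(row_group: RowGroup, num_records: int) -> Tuple[Dict[str, str], List[dict]]:
    """Return (schema, rows), reading at most num_records rows."""
    if num_records == 0:
        # Schema-only path: no row decoding is set up at all.
        return row_group.schema, []
    # Read only the requested number of rows, not the whole row group.
    return row_group.schema, row_group.rows[:num_records]


# An empty file is modeled here as a single row group with a schema and no rows.
empty_file = RowGroup(schema={"id": "INT", "name": "VARCHAR"})
data_file = RowGroup(schema={"id": "INT"}, rows=[{"id": 1}, {"id": 2}, {"id": 3}])
```

With this model, a query over `empty_file` still sees the column names and types, which is the point of adding an empty row group.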
DRILL-7337: Add vararg UDFs support

    • -0
    • +28
    ./org/apache/drill/CompileClassWithArraysAssignment.java
  1. … 37 more files in changeset.
DRILL-7327: Log Regex Plugin Won't Recognize Schema

The previous commit revised the plugin config classes to work

with table functions. That caused Jackson to stop working for

the classes. Fixed those issues and added unit tests.

closes #1827

    • -0
    • +396
    ./regex/firewall.ssdlog
  1. … 4 more files in changeset.
DRILL-7292: Remove V1 and V2 text readers

Drill 1.16 introduced the "V3" text reader based on the row set

and provided schema mechanisms. V3 was available by system/session

option as the functionality was considered experimental.

The functionality has now undergone thorough testing. This commit makes

the V3 text reader available by default, and removes the code for the

original "V1" and the "new" (compliant, "V2") text reader.

The system/session options that controlled reader selection are retained

for backward compatibility, but they no longer do anything.

Specific changes:

* Removed the V2 "compliant" text reader.

* Moved the "V3" to replace the "compliant" version.

* Renamed the "compliant" package to "reader."

* Removed the V1 text reader.

* Moved the V1 text writer (still used with the V2 and V3 readers)

into a new "writer" package adjacent to the reader.

* Removed the CSV tests for the V2 reader, including those that

demonstrated bugs in V2.

* V2 did not properly handle the quote escape character. One or two unit

tests depended on the broken behavior. Fixed them for the correct

behavior.

* Behavior of "messy quotes" (those that appear in a non-quoted field)

was undefined for the text reader. Added a test to clearly demonstrate

the (somewhat odd) behavior. The behavior itself was not changed.

Reran all unit tests to ensure that they work with the now-default V3

text reader.

closes #1806

  1. … 59 more files in changeset.
DRILL-7268: Read Hive array with parquet native reader

1. Fixed preserving of group originalType for projected schema

in DrillParquetReader

2. Added reading of LIST logical type to DrillParquetGroupConverter.

An intermediate no-op converter is used to skip writing the next nested

repeated field once the parent field is recognized as LIST. To do this,

skipRepeated 'true' is passed to the child converter's constructor.

close apache/drill#1805

    • binary
    ./parquet2/hive_arrays_p.parquet
  1. … 6 more files in changeset.
DRILL-7196: Queries are still runnable on disabled plugins

- Storage client is not created anymore for disabled plugins

- GET "/storage/{name}.json" endpoint now works with

plugin configuration directly, without client instantiation.

This has improved UI responsiveness.

- Hbase and mongo base test classes refactored to honor enabled

plugin attribute

- Fixed path constructor for mongo test datasets:

Now it is cross-platform

- Fixed the format of test JSON files that use plugin definitions

- Code cleanup

    • -35
    • +32
    ./decimal/cast_decimal_float.json
    • -54
    • +52
    ./decimal/cast_decimal_vardecimal.json
    • -42
    • +32
    ./decimal/cast_float_decimal.json
    • -42
    • +35
    ./decimal/cast_simple_decimal.json
    • -38
    • +35
    ./decimal/cast_vardecimal_decimal.json
    • -53
    • +49
    ./decimal/simple_decimal_arithmetic.json
  1. … 92 more files in changeset.
DRILL-7032: Ignore corrupt rows in a PCAP file

closes #1637

    • binary
    ./store/pcap/testv1.pcap
  1. … 4 more files in changeset.
DRILL-7096: Develop vector for canonical Map<K,V>

- Added new type DICT;

- Created value vectors for the type for single and repeated modes;

- Implemented corresponding FieldReaders and FieldWriters;

- Made changes in EvaluationVisitor to be able to read values from the map by key;

- Made changes to DrillParquetGroupConverter to be able to read Parquet's MAP type;

- Added an option `store.parquet.reader.enable_map_support` to disable reading MAP type as DICT from Parquet files;

- Updated AvroRecordReader to use new DICT type for Avro's MAP;

- Added support of the new type to ParquetRecordWriter.

    • binary
    ./store/parquet/complex/map/parquet/000000_0.parquet
    • binary
    ./store/parquet/complex/simple_map.parquet
  1. … 107 more files in changeset.
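To make the DICT idea in DRILL-7096 concrete, here is a minimal Python sketch of a column that stores one key/value map per row and supports lookup by key. The offsets-plus-flat-arrays layout and the `DictColumn` name are assumptions chosen for illustration; the commit message does not describe the actual vector layout.

```python
from typing import Any, Dict, List, Optional


class DictColumn:
    """Hypothetical columnar storage for one map (DICT) value per row."""

    def __init__(self) -> None:
        self.offsets: List[int] = [0]  # row i spans offsets[i]..offsets[i + 1]
        self.keys: List[Any] = []      # all keys, flattened across rows
        self.values: List[Any] = []    # all values, aligned with keys

    def append(self, entries: Dict[Any, Any]) -> None:
        """Append one row's key/value pairs."""
        for key, value in entries.items():
            self.keys.append(key)
            self.values.append(value)
        self.offsets.append(len(self.keys))

    def get(self, row: int, key: Any) -> Optional[Any]:
        """Read a value from the given row by key; None if the key is absent."""
        start, end = self.offsets[row], self.offsets[row + 1]
        for i in range(start, end):
            if self.keys[i] == key:
                return self.values[i]
        return None


col = DictColumn()
col.append({"a": 1, "b": 2})
col.append({})
col.append({"b": 20})
```

Unlike a struct-like MAP with a fixed set of fields, each row here can carry a different set of keys.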
DRILL-7021: HTTPD Throws NPE and Doesn't Recognize Timeformat

    • -0
    • +10
    ./httpd/hackers-access-small.httpd
  1. … 9 more files in changeset.
DRILL-6524: Prevent incorrect scalar replacement for the case of assigning references inside if block

    • -0
    • +42
    ./org/apache/drill/CompileClassWithIfs.java
  1. … 6 more files in changeset.
DRILL-7068: Support memory adjustment framework for resource management with Queues. closes #1677

    • -3
    • +9
    ./store/json/project_pushdown_json_physical_plan.json
  1. … 34 more files in changeset.
DRILL-4858: REPEATED_COUNT on an array of maps and an array of arrays is not implemented

- Implemented 'repeated_count' function for repeated MAP and repeated LIST;

- Updated RepeatedListReader and RepeatedMapReader implementations to return correct value from size() method

- Moved repeated_count to freemarker template and added support for more repeated types for the function

closes #1641

    • -0
    • +8
    ./functions/repeated/repeated_list.json
    • -0
    • +7
    ./functions/repeated/repeated_map.json
    • binary
    ./store/parquet/complex/repeated_types.parquet
  1. … 6 more files in changeset.
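As a rough illustration of what DRILL-4858 covers, the Python sketch below counts the top-level elements of an array of maps and an array of arrays. It assumes that `repeated_count` counts only outer elements and does not descend into nested values; the Python function is a stand-in, not Drill's generated code.

```python
from typing import Any, List


def repeated_count(values: List[Any]) -> int:
    """Count top-level elements of a repeated value (nested contents are ignored)."""
    if not isinstance(values, list):
        raise TypeError("repeated_count expects a repeated (list) value")
    return len(values)


array_of_maps = [{"a": 1}, {"b": 2}, {"c": 3}]
array_of_arrays = [[1, 2, 3], [4], []]
```

Both inputs above have three top-level elements, regardless of how many values sit inside each map or inner array.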
DRILL-7019: Add check for redundant imports

close apache/drill#1629

  1. … 22 more files in changeset.
DRILL-6970 Fix issue with logregex format plugin where drillbuf was overflowing

closes #1673

    • -0
    • +9990
    ./regex/large.log1
  1. … 2 more files in changeset.
DRILL-6944: UnsupportedOperationException thrown for view over MapR-DB binary table

1. Added persistence of MAP key and value types in Drill views (affects .view.drill file) for avoiding cast problems in future.

2. Preserved backward compatibility of older view files by treating untyped maps as ANY.

closes #1602

    • binary
    ./avro/map_string_to_long.avro
    • -0
    • +10
    ./view/vw_before_drill_6944.view.drill
  1. … 4 more files in changeset.
DRILL-6887: Fix FunctionInitializerTest.init() failure

close apache/drill#1566

DRILL-6751: Upgrade Apache parent POM to version 21

- Update apache.pom file version to 21 (with updating some maven plugins versions)

- Include Drill's sources jars on assembly stage in <moduleSets> (not <dependencySets>)

for properly including jars with the latest apache-21.pom

- Separate "distro-assembly" to the two execution stages to avoid:

[WARNING] Assembly file: <DRILL_HOME>/distribution/target/apache-drill-1.15.0-SNAPSHOT is not a regular

file (it may be a directory). It cannot be attached to the project build for installation or deployment.

- Remove unused <include>/<exclude> in assembly descriptor to avoid:

[WARNING] The following patterns were never triggered in this artifact inclusion filter

- Update "maven-assembly-plugin" version

- Update "slf4j" version

- Update "mockito-core" version

- Update "bcpkix-jdk15on" (Bouncy Castle Cryptography APIs) version

close apache/drill#1561

  1. … 7 more files in changeset.
DRILL-6744: Support varchar and decimal push down

1. Added enableStringsSignedMinMax parquet format plugin config and store.parquet.reader.strings_signed_min_max session option to control reading binary statistics for files generated by versions of Parquet prior to 1.10.0.

2. Added ParquetReaderConfig to store configuration needed during reading parquet statistics or files.

3. Provided mechanism to enable varchar / decimal filter push down.

4. Added VersionUtil to compare Drill versions in string representation.

5. Added appropriate unit tests.

closes #1537

    • binary
    ./parquet/decimal_gen_1_13_0/0_0_1.parquet
    • binary
    ./parquet/decimal_gen_1_13_0/0_0_2.parquet
    • binary
    ./parquet/varchar_gen_1_13_0/0_0_1.parquet
    • binary
    ./parquet/varchar_gen_1_13_0/0_0_2.parquet
  1. … 36 more files in changeset.
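DRILL-6744 mentions a VersionUtil for comparing Drill versions given as strings. A minimal Python sketch of that kind of comparison follows; the function names are hypothetical, and ignoring qualifiers such as `-SNAPSHOT` is an assumption made here, not a statement about how VersionUtil treats them.

```python
import re
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Extract (major, minor, patch) from a string such as '1.15.0-SNAPSHOT'."""
    match = re.match(r"^(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def compare_versions(left: str, right: str) -> int:
    """Return -1, 0 or 1 when left is older than, equal to, or newer than right."""
    a, b = parse_version(left), parse_version(right)
    return (a > b) - (a < b)
```

Comparing numeric tuples rather than raw strings matters because a plain string comparison would order "1.9.0" after "1.10.0".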
DRILL-6804: Simplify usage of OperatorPhase in HashAgg.

  1. … 3 more files in changeset.
DRILL-6768: Improve to_date, to_time and to_timestamp and corresponding cast functions to handle empty string when option is enabled closes #1494

    • -0
    • +3
    ./dateWithEmptyStrings.json
  1. … 24 more files in changeset.
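The empty-string handling added in DRILL-6768 can be sketched in Python as follows. The `empty_string_as_null` parameter is a hypothetical stand-in for the Drill option the commit refers to, and the format string uses Python's `strptime` syntax rather than Drill's date patterns.

```python
from datetime import date, datetime
from typing import Optional


def to_date(text: str, fmt: str, empty_string_as_null: bool = False) -> Optional[date]:
    """Parse text as a date; return None for '' when the option is enabled."""
    if text == "" and empty_string_as_null:
        return None
    # With the option disabled, an empty string fails like any other bad input.
    return datetime.strptime(text, fmt).date()
```

With the option disabled, an empty string raises `ValueError` from `strptime`, which mirrors the idea that the null conversion applies only when explicitly enabled.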
DRILL-6764: Query fails with IOB when Unnest has reference to deep nested field like (t.c_orders.o_lineitems).

closes #1487

    • -0
    • +134
    ./lateraljoin/nested-customer-map.json
  1. … 2 more files in changeset.
DRILL-6752: Surround Drill quotes with double quotes

1. Surround Drill quotes with double quotes.

2. Remove drill-sqlline-test.conf, use drill-sqlline.conf for tests instead.

closes #1475

  1. … 3 more files in changeset.
DRILL-3853: Upgrade to SqlLine 1.5.0 closes #1462

    • -0
    • +30
    ./drill-sqlline-test-override.conf
    • -0
    • +40
    ./drill-sqlline-test.conf
  1. … 14 more files in changeset.
DRILL-6349: Drill JDBC driver fails on Java 1.9+ with NoClassDefFoundError: sun/misc/VM

closes #1446

  1. … 23 more files in changeset.
DRILL-6685: Fixed exception when reading Parquet data

    • binary
    ./parquet/fourvarchar_asc_nulls.parquet
  1. … 4 more files in changeset.
DRILL-6670: Align Parquet TIMESTAMP_MICROS logical type handling with earlier versions + minor fixes

closes #1428

    • binary
    ./parquet/parquet_logical_types_simple.parquet
    • binary
    ./parquet/parquet_logical_types_simple_nodict.parquet
    • binary
    ./parquet/parquet_logical_types_simple_nullable.parquet
    • binary
    ./parquet/parquet_logical_types_simple_nullable_nodict.parquet
    • binary
    ./store/parquet/complex/parquet_logical_types_complex_nodict.parquet
    • binary
    ./store/parquet/complex/parquet_logical_types_complex_nullable_nodict.parquet
  1. … 8 more files in changeset.
DRILL-6104: Add Log/Regex Format Plugin

closes #1114

  1. … 9 more files in changeset.
DRILL-6603: Set num_nulls for parquet statistics to -1 when actual number is not defined.

    • binary
    ./parquet/wide_string.parquet
  1. … 8 more files in changeset.
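DRILL-6603 uses -1 as the num_nulls value when the real count is not defined. The Python sketch below shows why a sentinel like this matters for an "all nulls" check on row group statistics. Treating an unknown count as "cannot prove all nulls" is an assumption of this sketch, and the function name is illustrative rather than Drill's IsPredicate code.

```python
UNKNOWN_NULL_COUNT = -1  # sentinel from the commit message: actual number is not defined


def is_all_nulls(num_nulls: int, row_count: int) -> bool:
    """Return True only when statistics prove every row in the row group is null."""
    if num_nulls == UNKNOWN_NULL_COUNT:
        # Unknown statistics: the row group cannot safely be treated as all nulls.
        return False
    return num_nulls == row_count
```

Without a distinct sentinel, a missing count could be mistaken for a real count of zero nulls and lead to incorrect filter decisions.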
DRILL-6472: Prevent using zero precision in CAST function

- Add check for the correctness of scale value;

- Add check for fitting the value to the value with the concrete scale and precision;

- Implement negative UDF for VarDecimal

- Add unit tests for new checks and UDF.

    • -4
    • +4
    ./decimal/cast_decimal_vardecimal.json
    • -3
    • +3
    ./decimal/cast_vardecimal_decimal.json
  1. … 10 more files in changeset.
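The checks listed for DRILL-6472 (non-zero precision, a valid scale, and a value that fits the requested precision and scale) can be sketched in Python with the standard `decimal` module. The exact scale rule (0 <= scale <= precision) and the rounding mode are assumptions of this sketch, not details taken from the commit.

```python
from decimal import Decimal, ROUND_HALF_UP


def cast_to_decimal(value: str, precision: int, scale: int) -> Decimal:
    """Cast a numeric string to DECIMAL(precision, scale), validating both."""
    # Zero (or negative) precision is rejected up front.
    if precision < 1:
        raise ValueError(f"precision must be at least 1, got {precision}")
    # Assumed scale rule: 0 <= scale <= precision.
    if not 0 <= scale <= precision:
        raise ValueError(f"scale {scale} is not valid for precision {precision}")
    # Round to the requested scale (the rounding mode is an assumption).
    quantum = Decimal(1).scaleb(-scale)
    result = Decimal(value).quantize(quantum, rounding=ROUND_HALF_UP)
    # After rounding, the total number of digits must fit the precision.
    if len(result.as_tuple().digits) > precision:
        raise ValueError(f"{value} does not fit DECIMAL({precision}, {scale})")
    return result
```

For example, "123.45" fits DECIMAL(5, 2) but not DECIMAL(4, 2), because five digits are needed after rounding to two decimal places.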