DRILL-7759: Code compilation exception for queries containing (untyped) NULL

    • -0
    • +56
    ./physical_untyped_null.json
  1. … 3 more files in changeset.
DRILL-7738: Fix TestDynamicUDFSupport failure for GitHub Actions

DRILL-7713: Upgrade misc libraries whose outdated versions have reported vulnerabilities

1. Jackson

2. Retrofit

3. Commons-beanutils

4. Xalan

5. Xerces

6. Commons-codec

7. Snakeyaml

8. Metadata-extractor

9. Protostuff

  1. … 9 more files in changeset.
DRILL-7330: Implement metadata usage for all format plugins

    • -0
    • +0
    ./store/text/directoryWithEmptyCSV/empty.csv
  1. … 59 more files in changeset.
DRILL-7592: Add missing licenses and update plugins exclusion list and fix licenses

closes #1989

  1. … 79 more files in changeset.
DRILL-7590: Refactor plugin registry

Major cleanup of the plugin registry to split it into components in preparation for a proper plugin API. Better coordinates the named and ephemeral plugin caches. Cleans up the registry API. Sharpens rules for modifying plugin configs.

closes #1988

    • -0
    • +22
    ./plugins/bogus-bootstrap.json
    • -0
    • +28
    ./plugins/dup-bootstrap.json
    • -0
    • +14
    ./plugins/mock-format-bootstrap.json
    • -0
    • +195
    ./plugins/mock-plugin-upgrade.json
  1. … 160 more files in changeset.
DRILL-7491: Incorrect count() returned for complex types in parquet

closes #1955

    • binary
    ./parquet/hive_all/hive_alltypes.parquet
  1. … 4 more files in changeset.
DRILL-7509: Incorrect TupleSchema is created for DICT column when querying Parquet files

    • binary
    ./store/parquet/complex/repeated_struct.parquet
  1. … 15 more files in changeset.
DRILL-7502: Invalid codegen for typeof() with UNION

Also fixes DRILL-6362: typeof() reports NULL for primitive columns with a NULL value. typeof() is meant to return "NULL" if a UNION has a NULL value, but the column type when known, such as for non-UNION columns.

Also fixes DRILL-7499: sqltypeof() function with an array returns "ARRAY", not type. This was due to treating REPEATED like LIST.

Handling of the Union vector in code gen is problematic with about three special cases. Existing code handled two of the cases. This change handles the third case.

Figuring out the change required poking around quite a bit of unclear code. Added comments and restructuring to make that code a bit more clear.

The fix modified code gen for the Union Holder. It can now "go back in time" to add the union reader at the point we need it.

closes #1945

    • -0
    • +8
    ./jsoninput/allTypes.json
    • -0
    • +5
    ./jsoninput/union/c.json
  1. … 53 more files in changeset.
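
The typeof() rule stated in the DRILL-7502 message can be sketched as a small decision function. This is only an illustration, not Drill's generated code: the class `TypeOfSketch`, the enum `Kind`, and the method signature are hypothetical, and returning the value's own type for a non-NULL UNION value is an assumption the commit message does not spell out.

```java
// Hypothetical sketch of the typeof() rule; not taken from the Drill code base.
final class TypeOfSketch {

  // Illustrative type names only.
  enum Kind { INT, VARCHAR, UNION }

  /**
   * @param columnKind the declared type of the column
   * @param unionValueKind for UNION columns, the type of the current value,
   *                       or null when the value is NULL; ignored otherwise
   */
  static String typeOf(Kind columnKind, Kind unionValueKind) {
    if (columnKind != Kind.UNION) {
      // Type is known from the column, even when the value itself is NULL.
      return columnKind.name();
    }
    // UNION column: a NULL value has no type of its own.
    // Assumption: a non-NULL value reports its own type.
    return unionValueKind == null ? "NULL" : unionValueKind.name();
  }
}
```

Under these assumptions, a nullable INT column holding NULL reports "INT", while a UNION column holding NULL reports "NULL".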
DRILL-7503: Refactor the project operator

Breaks the big "setup" function into its own class, and separates out physical vector setup from logical projection planning. No functional change; just rearranging existing code.

closes #1944

  1. … 8 more files in changeset.
DRILL-7485: NPE on PCAP Batch Reader

closes #1932

    • binary
    ./store/pcap/arpWithNullIP.pcap
  1. … 3 more files in changeset.
DRILL-7484: Malware found in the Drill test folder

closes #1934

    • -0
    • +1
    ./store/pcap/dataFromRemote.txt
    • binary
    ./store/pcap/http.pcap
  1. … 1 more file in changeset.
DRILL-7473: Parquet reader failed to get field of repeated map

closes #1933

  1. … 5 more files in changeset.
DRILL-7443: Enable PCAP Plugin to Reassemble TCP Streams

closes #1898

    • binary
    ./store/pcap/attack-trace.pcap
  1. … 12 more files in changeset.
DRILL-7450: Improve performance for ANALYZE command

- Implement two-phase aggregation for the lowest metadata aggregate to optimize performance

- Allow using complex functions with hash aggregate

- Use hash aggregation for PHASE_1of2 for ANALYZE to reduce memory usage and avoid sorting non-aggregated data

- Add sort above hash aggregation to fix correctness of merge exchange and stream aggregate

closes #1907

  1. … 58 more files in changeset.
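
The two-phase aggregation mentioned in the DRILL-7450 message can be illustrated with a generic sketch: each fragment first hash-aggregates its own rows into partial results, and a second phase merges the partials. The class and method names below are hypothetical, and a simple per-key count stands in for the metadata aggregate; this is not Drill's operator code.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of two-phase aggregation; not Drill's implementation.
final class TwoPhaseCountSketch {

  // Phase 1 (per fragment): hash-aggregate local rows into partial counts.
  // The raw rows are never sorted; only one entry per distinct key is kept.
  static Map<String, Long> phaseOne(List<String> keys) {
    Map<String, Long> partial = new HashMap<>();
    for (String key : keys) {
      partial.merge(key, 1L, Long::sum);
    }
    return partial;
  }

  // Phase 2: merge the partial counts from all fragments into final counts.
  static Map<String, Long> phaseTwo(List<Map<String, Long>> partials) {
    Map<String, Long> total = new HashMap<>();
    for (Map<String, Long> partial : partials) {
      partial.forEach((key, count) -> total.merge(key, count, Long::sum));
    }
    return total;
  }
}
```

The second phase only sees one entry per key per fragment, which is why the first phase reduces the volume of data that has to be exchanged and merged.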
DRILL-7409: Moving test with huge test data to the drill-test-framework.

closes #1891

  1. … 1 more file in changeset.
DRILL-7413: Test and fix scan operator vectors

Enables vector validation tests for the ScanBatch and all EasyFormat plugins. Fixes a bug in scan batch that failed to set the record count in the output container.

Fixes a number of formatting and other issues found while adding the tests.

  1. … 6 more files in changeset.
DRILL-7412: Minor unit test improvements

Many tests intentionally trigger errors. A debug-only log setting sent those errors to stdout. The resulting stack dumps simply cluttered the test output, so disabled error output to the console.

Drill can apply bounds checks to vectors. Tests run via Maven enable bounds checking. Now, bounds checking is also enabled in "debug mode" (when assertions are enabled, as in an IDE.)

Drill contains two test frameworks. The older BaseTestQuery was marked as deprecated, but many tests still use it and are unlikely to be changed soon. So, removed the deprecated marker to reduce the number of spurious warnings.

Also includes a number of minor clean-ups.

closes #1876

  1. … 17 more files in changeset.
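
The DRILL-7412 message ties bounds checking to "debug mode", meaning a JVM with assertions enabled. A common Java idiom for detecting that is an `assert` with a side effect. The sketch below shows the idiom only; the class and method names are hypothetical, and how Drill actually wires this into its bounds-checking setting is not shown in the message.

```java
// Hypothetical sketch; the names here are illustrative and not Drill's API.
final class AssertionModeSketch {

  // Returns true only when the JVM runs with assertions enabled (-ea),
  // because the assignment inside the assert is skipped otherwise.
  static boolean assertionsEnabled() {
    boolean enabled = false;
    assert enabled = true;
    return enabled;
  }

  // Assumed rule: check bounds when explicitly requested (for example by
  // the build) or when running in "debug mode" with assertions on.
  static boolean boundsCheckingEnabled(boolean explicitlyRequested) {
    return explicitlyRequested || assertionsEnabled();
  }
}
```

IDEs commonly launch tests with `-ea`, so under this rule a test started from an IDE would get bounds checking without any extra configuration.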
DRILL-7326: Support repeated lists for CTAS parquet format

closes #1844

    • -0
    • +2
    ./jsoninput/repeated_list_of_maps.json
  1. … 4 more files in changeset.
DRILL-4517: Support reading empty Parquet files

1. Modified flat and complex parquet readers to output schema only when requested number of records to read is 0. In this case readers are not initialized to improve performance.

2. Allowed reading requested number of rows instead of all rows in the row group (DRILL-6528).

3. Fixed determination of the number of nulls in the row group (fixed IsPredicate#isAllNulls method).

4. Allowed reading empty parquet files by adding an empty / fake row group.

5. General refactoring and unit tests.

6. Parquet tests categorization.

closes #1839

    • binary
    ./parquet/empty/complex/empty_complex.parquet
    • binary
    ./parquet/empty/complex/non_empty_complex.parquet
    • binary
    ./parquet/empty/simple/empty_simple.parquet
    • binary
    ./parquet/empty/simple/non_empty_simple.parquet
  1. … 45 more files in changeset.
DRILL-7337: Add vararg UDFs support

    • -0
    • +28
    ./org/apache/drill/CompileClassWithArraysAssignment.java
  1. … 37 more files in changeset.
DRILL-7327: Log Regex Plugin Won't Recognize Schema

The previous commit revised the plugin config classes to work with table functions. That caused Jackson to stop working for the classes. Fixed those issues and added unit tests.

closes #1827

    • -0
    • +396
    ./regex/firewall.ssdlog
  1. … 4 more files in changeset.
DRILL-7292: Remove V1 and V2 text readers

Drill 1.16 introduced the "V3" text reader based on the row set and provided schema mechanisms. V3 was available by system/session option as the functionality was considered experimental.

The functionality has now undergone thorough testing. This commit makes the V3 text reader available by default, and removes the code for the original "V1" and the "new" (compliant, "V2") text reader.

The system/session options that controlled reader selection are retained for backward compatibility, but they no longer do anything.

Specific changes:

* Removed the V2 "compliant" text reader.

* Moved the "V3" to replace the "compliant" version.

* Renamed the "compliant" package to "reader."

* Removed the V1 text reader.

* Moved the V1 text writer (still used with the V2 and V3 readers) into a new "writer" package adjacent to the reader.

* Removed the CSV tests for the V2 reader, including those that demonstrated bugs in V2.

* V2 did not properly handle the quote escape character. One or two unit tests depended on the broken behavior. Fixed them for the correct behavior.

* Behavior of "messy quotes" (those that appear in a non-quoted field) was undefined for the text reader. Added a test to clearly demonstrate the (somewhat odd) behavior. The behavior itself was not changed.

Reran all unit tests to ensure that they work with the now-default V3 text reader.

closes #1806

  1. … 59 more files in changeset.
DRILL-7268: Read Hive array with parquet native reader

1. Fixed preserving of the group originalType for the projected schema in DrillParquetReader.

2. Added reading of the LIST logical type to DrillParquetGroupConverter. An intermediate no-op converter is used to skip writing for the next nested repeated field once the parent field is recognized as LIST. For this, skipRepeated 'true' is passed to the child converter's constructor.

close apache/drill#1805

    • binary
    ./parquet2/hive_arrays_p.parquet
  1. … 6 more files in changeset.
DRILL-7196: Queries are still runnable on disabled plugins

- Storage client is no longer created for disabled plugins

- GET "/storage/{name}.json" endpoint now works with the plugin configuration directly, without client instantiation. This has improved UI responsiveness.

- HBase and Mongo base test classes refactored to honor the enabled plugin attribute

- Fixed the path constructor for Mongo test datasets; it is now cross-platform

- Fixed the format of test JSON files that use plugin definitions

- Code cleanup

    • -35
    • +32
    ./decimal/cast_decimal_float.json
    • -54
    • +52
    ./decimal/cast_decimal_vardecimal.json
    • -42
    • +32
    ./decimal/cast_float_decimal.json
    • -42
    • +35
    ./decimal/cast_simple_decimal.json
    • -38
    • +35
    ./decimal/cast_vardecimal_decimal.json
    • -53
    • +49
    ./decimal/simple_decimal_arithmetic.json
  1. … 92 more files in changeset.
DRILL-7032: Ignore corrupt rows in a PCAP file

closes #1637

    • binary
    ./store/pcap/testv1.pcap
  1. … 4 more files in changeset.
DRILL-7096: Develop vector for canonical Map<K,V>

- Added new type DICT;

- Created value vectors for the type for single and repeated modes;

- Implemented corresponding FieldReaders and FieldWriters;

- Made changes in EvaluationVisitor to be able to read values from the map by key;

- Made changes to DrillParquetGroupConverter to be able to read Parquet's MAP type;

- Added an option `store.parquet.reader.enable_map_support` to disable reading MAP type as DICT from Parquet files;

- Updated AvroRecordReader to use new DICT type for Avro's MAP;

- Added support of the new type to ParquetRecordWriter.

    • binary
    ./store/parquet/complex/map/parquet/000000_0.parquet
    • binary
    ./store/parquet/complex/simple_map.parquet
  1. … 107 more files in changeset.
DRILL-7021: HTTPD Throws NPE and Doesn't Recognize Timeformat

    • -0
    • +10
    ./httpd/hackers-access-small.httpd
  1. … 9 more files in changeset.
DRILL-6524: Prevent incorrect scalar replacement for the case of assigning references inside if block

    • -0
    • +42
    ./org/apache/drill/CompileClassWithIfs.java
  1. … 6 more files in changeset.
DRILL-7068: Support memory adjustment framework for resource management with Queues.

closes #1677

    • -3
    • +9
    ./store/json/project_pushdown_json_physical_plan.json
  1. … 34 more files in changeset.