drill

DRILL-7125: REFRESH TABLE METADATA fails after upgrade from Drill 1.13.0 to Drill 1.15.0

DRILL-7118: Filter not getting pushed down on MapR-DB tables.

closes #1708

DRILL-7124: Fix logger for ShowFilesHandler

closes #1705

DRILL-7113: Fix creation of filter conditions for IS NULL and IS NOT NULL for MapR-DB format plugin

close apache/drill#1704

DRILL-7115: Improve Hive schema show tables performance

1. To make SHOW TABLES for a Hive schema much faster, the additional Drill feature of showing only accessible tables when Storage-Based authorization is enabled was sacrificed. The behaviour now matches Hive/Beeline: all tables are shown regardless of accessibility. For details about previous SHOW TABLES results, see the description of DRILL-540.

2. Implemented a faster getTableNamesAndTypes() method in HiveDatabaseSchema and removed bulk-related code.

3. Deprecated bulk-related options and removed bulk code from AbstractSchema and DrillHiveMetastoreClient.

4. For 8000 Hive tables, the query returned in 1.8 seconds; for a combination of 4000 tables and 8000 views, the query returned in 2.3 seconds. Note that after the first query, table names are cached and subsequent queries complete in less than 1 second.

5. Refactored WorkspaceSchemaFactory's getTableNamesAndTypes() method to reuse the existing getViews() method.

6. Refactored DrillHiveMetastoreClient. Classes were unnested and enclosed within the client package with restricted visibility. The cache value type was also updated to avoid unnecessary List-to-Set back-and-forth conversions. Client creation methods were moved to a separate class, so the new package exposes only the factory and client classes.

closes #1706
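
The caching behaviour mentioned in item 4 (the first query pays the metastore cost, later queries are served from memory) can be sketched roughly as follows. This is only an illustration: `TableNameCacheSketch`, its loader function, and the use of `ConcurrentHashMap` are assumptions made for the example and are not taken from DrillHiveMetastoreClient.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch, not Drill code: caches table names per database so
// that only the first lookup calls the (slow) loader.
final class TableNameCacheSketch {
  private final Map<String, List<String>> cache = new ConcurrentHashMap<>();
  private final Function<String, List<String>> loader;

  TableNameCacheSketch(Function<String, List<String>> loader) {
    this.loader = loader;
  }

  // Returns cached names if present; otherwise loads and stores them.
  List<String> tableNames(String database) {
    return cache.computeIfAbsent(database, loader);
  }
}
```

Storing the values directly in the collection type that callers need is one way to avoid the repeated List-to-Set conversions that item 6 refers to.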

DRILL-7148: Use improved join cardinality and ndv estimation with statistics

closes #1744

DRILL-7155: Create a standard logging message for batch sizes generated by individual operators

This is needed for QA verification of the Batch Size feature (DRILL-6238).

closes #1716

DRILL-6965: Implement schema table function parameter

1. Added a common schema table function parameter which can be used on its own or together with format plugin table function parameters.

2. Allowed creating a schema without columns, for cases where the user only needs to specify table properties.

3. Added unit tests.

closes #1777

DRILL-7089: Implement caching for TableMetadataProvider at query level and adapt statistics to use Drill metastore API

closes #1728

DRILL-7109: Apply selectivity calculations to single column filter predicates

closes #1701

DRILL-7111: Fix table function execution for directories

closes #1700

DRILL-7011: Support schema in scan framework

* Adds schema support to the row set-based scan framework and to the "V3" text reader based on that framework.

* Adding the schema made clear that passing options as a long list of constructor arguments was not sustainable. Refactored code to use a builder pattern instead.

* Added support for default values in the "null column loader", which required adding a "setValue" method to the column accessors.

* Added unit tests for all new or changed functionality. See TestCsvWithSchema for the overall test of the entire integrated mechanism.

* Added tests for explicit projection with schema

* Better handling of date/time in column accessors

* Converted recent column metadata work from Java 8 date/time to Joda.

* Added more CSV-with-schema unit tests

* Removed the ID fields from "resolved columns", used "instanceof" instead.

* Added wildcard projection with an output schema. Handles both "lenient" and "strict" schemas.

* Tagged projection columns with their output schema, when available.

* Scan projection added modes for wildcard with an output schema. The reader projection added support for merging reader and output schemas.

* Includes refactoring of scan operator tests (the test file grew too large.)

* Renamed some classes to avoid confusing reader schemas with output schemas.

* Added unit tests for the new functionality.

* Added "lenient" wildcard with schema test for CSV

* Added more type conversions: string-to-bit, many-to-string

* Fixed bug in column writer for VarDecimal

* Added missing unit tests, and fixed bugs, in Bit column reader/writer

* Cleaned up a number of unneeded "SuppressWarnings"

closes #1711
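
The move from long constructor argument lists to a builder, described in the second bullet above, might look roughly like the sketch below. All names here (`ScanOptionsSketch`, `project`, `outputSchema`, `batchSizeLimit`, the default of 4096 rows) are hypothetical and chosen for illustration; they are not the scan framework's actual classes or options.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch, not Drill code: options collected through a builder
// instead of a long list of constructor arguments.
final class ScanOptionsSketch {
  final List<String> projection;
  final String outputSchema;   // null when no schema was provided
  final boolean strictSchema;
  final int batchSizeLimit;

  private ScanOptionsSketch(Builder b) {
    this.projection = Collections.unmodifiableList(new ArrayList<>(b.projection));
    this.outputSchema = b.outputSchema;
    this.strictSchema = b.strictSchema;
    this.batchSizeLimit = b.batchSizeLimit;
  }

  static Builder builder() {
    return new Builder();
  }

  static final class Builder {
    private final List<String> projection = new ArrayList<>();
    private String outputSchema;
    private boolean strictSchema;
    private int batchSizeLimit = 4096; // assumed default for the example

    Builder project(String column) {
      projection.add(column);
      return this;
    }

    Builder outputSchema(String schema, boolean strict) {
      this.outputSchema = schema;
      this.strictSchema = strict;
      return this;
    }

    Builder batchSizeLimit(int rows) {
      if (rows <= 0) {
        throw new IllegalArgumentException("rows must be positive");
      }
      this.batchSizeLimit = rows;
      return this;
    }

    ScanOptionsSketch build() {
      return new ScanOptionsSketch(this);
    }
  }
}
```

A caller sets only the options it cares about, for example `ScanOptionsSketch.builder().project("a").outputSchema("a INT", true).build()`, and new options can be added later without touching existing call sites.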

DRILL-7106: Fix Intellij warning for FieldSchemaNegotiator

closes #1698

DRILL-7105: Error while building the Drill native client

Added a compiler option in CMakeLists.txt to support the ISO C++ 2011 standard.

Also, changed the CMake min version to 3.1.3 to match the min version specified in protobuf.

closes #1697

DRILL-7107: Unable to connect to Drill 1.15 through ZK

closes #1702

DRILL-7095: Expose table schema (TupleMetadata) to physical operator (EasySubScan)

1. Added the system/session option store.table.use_schema_file to control whether a file schema can be used during query execution. False by default.

2. Added methods to the StoragePlugin interface that allow creating a Group Scan with a provided table schema.

3. EasyGroupScan and EasySubScan now contain the table schema and can serialize and deserialize it along with other scan properties.

4. DrillTable, which is the main entry point for schema provisioning, has a method to store the schema and later uses it to create the physical scan.

5. WorkspaceSchema, when returning a Drill table instance, gets the table schema from the table root if it is available and store.table.use_schema_file is set to true.

This PR is the next step in the Schema Provisioning project, which currently exposes the schema only for the text reader.

closes #1696

DRILL-7103: Incorrect conversion to integer in BsonRecordReader#writeTimeStamp

closes #1695
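
The commit message does not spell out the defect. One common way an "incorrect conversion to integer" arises with timestamps is multiplying an `int` seconds value by 1000 before it is widened to `long`. The sketch below illustrates that general pitfall under that assumption; the class and method names are hypothetical and this is not the code from BsonRecordReader.

```java
// Hypothetical sketch, not Drill code: shows how int arithmetic can corrupt
// a seconds-to-milliseconds conversion.
final class TimestampOverflowSketch {

  // Buggy form: the multiplication is done in 32-bit int arithmetic and
  // wraps around before the result is widened to long.
  static long millisOverflowing(int epochSeconds) {
    return epochSeconds * 1000;
  }

  // Safer form: the long literal forces 64-bit arithmetic.
  static long millisWidened(int epochSeconds) {
    return epochSeconds * 1000L;
  }
}
```

For any realistic epoch value the first method returns a wrapped number, while the second returns the expected millisecond count.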

DRILL-7061: Disable LIMIT Rows Option

Freemarker by default introduces a comma in numeric values greater than 999. This corrects that by removing the ',' in the default limit size.

However, since a server-side implementation is in progress (DRILL-6960 and DRILL-7048), it is best to disable this for now. The latest commit here hides those capabilities in the Web UI until the server-side feature goes in.

closes #1689
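
The comma described above comes from locale-aware number formatting with digit grouping. The commit message does not say exactly how the template was changed (in FreeMarker templates the `?c` built-in is one way to get ungrouped output), so the sketch below only reproduces the underlying behaviour with `java.text.NumberFormat`. The class and method names are hypothetical.

```java
import java.text.NumberFormat;
import java.util.Locale;

// Hypothetical sketch, not Drill code: grouped versus ungrouped rendering
// of a numeric default such as a row limit.
final class NumberGroupingSketch {

  // Locale-aware formatting inserts a grouping separator above 999.
  static String localized(long value) {
    return NumberFormat.getNumberInstance(Locale.US).format(value);
  }

  // Disabling grouping keeps the digits contiguous.
  static String ungrouped(long value) {
    NumberFormat format = NumberFormat.getNumberInstance(Locale.US);
    format.setGroupingUsed(false);
    return format.format(value);
  }

  // Plain conversion never applies locale rules.
  static String plain(long value) {
    return String.valueOf(value);
  }
}
```

A grouped value such as "1,000" placed into an HTML numeric input or parsed back as an integer would not round-trip, which is why the ungrouped form is preferable for a default limit.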

DRILL-7108: Improve selectivity estimates for (NOT)LIKE, NOT_EQUALS, IS NOT NULL predicates

closes #1699

DRILL-7100: Fixed IllegalArgumentException when reading Parquet data

DRILL-6707: Removed changes for setOutputRowCount.

Modified LateralJoin to use new setCurrentOutgoingMaxRowCount API.

Limit CurrentOutgoingMaxRowCount to MAX_NUM_ROWS.

Fix HashJoin to fix failing tests.

closes #1650

DRILL-7021: HTTPD Throws NPE and Doesn't Recognize Timeformat

DRILL-7092: Rename map to struct in schema definition

1. Renamed map to struct in schema parser.

2. Updated sqlTypeOf function to return STRUCT instead of MAP, drillTypeOf function will return MAP as before until internal renaming is done.

3. Add is_struct alias to already existing is_map function. Function should be revisited once Drill supports true maps.

4. Updated unit tests.

closes #1688

DRILL-2326: Fix scalar replacement for the case when static method which does not return values is called

- Fix check for return function value to handle the case when created object is returned without assigning it to the local variable

closes #1687

DRILL-7086: Output schema for row set mechanism

Enhances the row set mechanism to take an "output schema" that describes the vectors to create. The "input schema" describes the type that the reader would like to write. A conversion mechanism inserts a conversion shim to convert from the input to output type. Provides a set of implicit type conversions, including string-to-date/time conversions which use the new format property stored in column metadata. Includes unit tests for the new functionality.

closes #1690
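
The conversion shim idea (the reader writes its input type, and a thin layer converts to the output type before the value reaches the vector) can be sketched as below. This is a simplified illustration under stated assumptions: `DateSink` and `StringToDateShim` are hypothetical names, and `java.time` is used here only to keep the example self-contained; it does not reflect which date/time library or accessor interfaces Drill uses.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical sketch, not Drill code: a shim that accepts strings from a
// reader and forwards parsed dates to a date-typed column writer.
final class ConversionShimSketch {

  // Stand-in for a column writer of the output type (DATE).
  interface DateSink {
    void setDate(LocalDate value);
  }

  // Sits between the reader (input type: String) and the date writer.
  static final class StringToDateShim {
    private final DateTimeFormatter format;
    private final DateSink target;

    // formatPattern plays the role of the format property in column metadata.
    StringToDateShim(String formatPattern, DateSink target) {
      this.format = DateTimeFormatter.ofPattern(formatPattern);
      this.target = target;
    }

    // The reader calls this; the shim converts and delegates.
    void setString(String value) {
      target.setDate(LocalDate.parse(value, format));
    }
  }
}
```

With this arrangement the reader stays unaware of the output schema: it keeps writing strings, and swapping the shim changes the stored type.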

DRILL-6524: Assign holder fields instead of assigning object references in generated code to allow scalar replacement for more cases

closes #1686

DRILL-6524: Prevent incorrect scalar replacement for the case of assigning references inside if block

DRILL-7085: Fix table-path check in AnalyzeTableHandler

closes #1685

DRILL-6852: Adapt current Parquet Metadata cache implementation to use Drill Metastore API

Co-authored-by: Volodymyr Vysotskyi <vvovyk@gmail.com>

Co-authored-by: Vitalii Diravka <vitalii@apache.org>

close apache/drill#1646

DRILL-7014: Format plugin for LTSV files

closes #1627

    /contrib/format-ltsv/README.md (-0, +38)
    /contrib/format-ltsv/pom.xml (-0, +56)
    /contrib/format-ltsv/src/test/resources/emptylines.ltsv (-0, +4)
    /contrib/format-ltsv/src/test/resources/invalid.ltsv (-0, +1)
    /contrib/format-ltsv/src/test/resources/simple.ltsv (-0, +2)