DRILL-7337: Add vararg UDFs support

    • -38
    • +38
    ./ConstantExpressionIdentifier.java
  1. … 37 more files in changeset.
DRILL-6422: Replace guava imports with shaded ones

  1. … 980 more files in changeset.
DRILL-6656: Disallow extra semicolons and multiple statements on the same line.

closes #1415

  1. … 144 more files in changeset.
DRILL-6320: Fixed license headers.

closes #1207

  1. … 2057 more files in changeset.
DRILL-6094: Decimal data type enhancements

Add ExprVisitors for VARDECIMAL

Modify writers/readers to support VARDECIMAL

- Added usage of VarDecimal for parquet, hive, maprdb, jdbc;

- Added options to store decimals as int32 and int64 or fixed_len_byte_array or binary;

Add UDFs for VARDECIMAL data type

- modify type inference rules

- remove UDFs for obsolete DECIMAL types

Enable DECIMAL data type by default

Add unit tests for DECIMAL data type

Fix mapping for NLJ when literal with non-primitive type is used in join conditions

Refresh protobuf C++ source files

Changes in C++ files

Add support for decimal logical type in Avro.

Add support for date, time and timestamp logical types.

Update Avro version to 1.8.2.

  1. … 201 more files in changeset.
DRILL-6375: Support for ANY_VALUE aggregate function

closes #1256

  1. … 36 more files in changeset.
DRILL-6259: Support parquet filter push down for complex types

close apache/drill#1173

  1. … 24 more files in changeset.
DRILL-6028: Allow splitting generated code in ChainedHashTable into blocks to avoid "code too large" error

1. Added new parameter seedValue to getHashBuild and getHashProbe methods in HashTableTemplate.

2. Generate a logical expression for each key so that it can be split into blocks if the number of expressions in a method exceeds the upper limit.

3. ParameterExpression was added to generate reference to method parameter during code generation.

closes #1071

  1. … 12 more files in changeset.
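The block-splitting idea described in DRILL-6028 can be illustrated with a minimal sketch. This is not Drill's code generator: `BlockSplitter`, `split`, and the limit of 2 are hypothetical names and values chosen for illustration. It only shows how per-key expressions might be grouped so that each group could be emitted as its own method rather than as one oversized method.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Drill.
final class BlockSplitter {
    // Groups expressions into consecutive blocks of at most maxPerBlock entries.
    static <T> List<List<T>> split(List<T> exprs, int maxPerBlock) {
        if (maxPerBlock <= 0) {
            throw new IllegalArgumentException("maxPerBlock must be positive");
        }
        List<List<T>> blocks = new ArrayList<>();
        for (int start = 0; start < exprs.size(); start += maxPerBlock) {
            int end = Math.min(start + maxPerBlock, exprs.size());
            blocks.add(new ArrayList<>(exprs.subList(start, end)));
        }
        return blocks;
    }

    public static void main(String[] args) {
        // One placeholder expression per key; a real generator would hold expression objects.
        List<String> keyExprs =
            Arrays.asList("hash(k0)", "hash(k1)", "hash(k2)", "hash(k3)", "hash(k4)");
        List<List<String>> blocks = split(keyExprs, 2);
        for (int i = 0; i < blocks.size(); i++) {
            System.out.println("block" + i + " -> " + blocks.get(i));
        }
    }
}
```

In a real generator the limit would presumably be chosen so that each emitted method stays under the JVM's per-method bytecode size limit.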
DRILL-5116: Enable generated code debugging in each Drill operator

DRILL-5052 added the ability to debug generated code. The reviewer suggested

permitting the technique to be used for all Drill operators. This PR provides

the required fixes. Most were small changes, others dealt with the rather

clever way that the existing byte-code merge converted static nested classes

to non-static inner classes, with the way that constructors were inserted

at the byte-code level and so on. See the JIRA for the details.

This code passed the unit tests twice: once with the traditional byte-code

manipulations, a second time using "plain-old Java" code compilation.

Plain-old Java is turned off by default, but can be turned on for all

operators with a single config change: see the JIRA for info. Consider

the plain-old Java option to be experimental: very handy for debugging,

perhaps not quite tested enough for production use.

close apache/drill#716

  1. … 62 more files in changeset.
DRILL-5052: Option to debug generated Java code using an IDE

Provides a second compilation path for generated code: “plain old Java”

in which generated classes inherit from their templates. Such code can be

compiled directly, allowing easy debugging of generated code.

Also shows how to generate two classes in the External Sort Batch as “plain

old Java” to enable IDE debugging of that generated code. Required

minor clean-up of the templates.

Fixes some broken toString( ) methods in code generation classes

Fixes a variety of small compilation warnings

Adds Java doc to a few classes

Includes clean-up from code review comments.

close apache/drill#660

  1. … 31 more files in changeset.
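The "plain old Java" path described in DRILL-5052 has generated classes inherit from their templates. A minimal sketch of that shape follows; `FilterTemplate`, `GeneratedFilter`, and `doEval` are hypothetical names used only for illustration and are not taken from Drill's sources.

```java
// Hypothetical template: shared logic lives here, the generated part is abstract.
abstract class FilterTemplate {
    // Template method that calls into the generated code.
    final int countMatches(int[] values) {
        int matches = 0;
        for (int v : values) {
            if (doEval(v)) {
                matches++;
            }
        }
        return matches;
    }

    // Filled in by generated code.
    protected abstract boolean doEval(int value);
}

// What a generated class might look like when emitted as an ordinary subclass.
// Because it is plain source, a debugger can step through doEval directly.
final class GeneratedFilter extends FilterTemplate {
    @Override
    protected boolean doEval(int value) {
        return value > 10;
    }

    public static void main(String[] args) {
        System.out.println(new GeneratedFilter().countMatches(new int[] {5, 11, 20}));
    }
}
```

The contrast is with the byte-code merge path, where the generated code is merged into the template class at the byte-code level and there is no ordinary source-level subclass to step through.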
DRILL-1385, along with some cleanup

Cleaned up option handling. This includes using finals, making member variables

private whenever possible, and some formatting.

- fixed a bug in the string formatting for the double range validator

- OptionValidator, OptionValue, and their implementations now conspire not to

allow the creation of malformed options because the OptionType has been added

to validator calls to handle OptionValues that are created on demand.

Started with updated byte code rewrite from Jacques

Fixed several problems with scalar value replacement:

- use consistent ASM api version throughout

- stop using deprecated ASM methods (actually causes bugs)

- visitMethodInsn()

- added a couple of missing super.visitEnd()s

- fixed a couple of minor FindBugs issues

- accounted for required stack size increases when replacing holders for

longs and doubles

- added accounting for frame offsets to cope with long and double local

variables and value holder members

- fixed a few minor bugs found with FindBugs

- stop using carrotlabs' hash map lget() method on shared constant data

- fixed an incorrect use of DUP2 on objectrefs when copying long or double

holder members into locals

- fixed a problem with redundant POP instructions left behind after replacement

- fixed a problem with incorrect DUPs in multiple assignment statements

- fixed a problem with DUP_X1 replacement when handling constants in multiple

assignment statements

- fixed a problem with non-replaced holder member post-decrements

- don't replace holders passed to static functions as "out" parameters

(common with Accessors on repeated value vectors)

- increased the maximum required stack size when transferring holder members to

locals

- changed the code generation block type mappings for constants for external

sorts

- fixed problems handling constant and non-constant member variables in

operator classes

- in general, if a holder is assigned to or from an operator member variable,

it can't be replaced (at least not until we replace those as well)

- Use a derived ASM Analyzer (MethodAnalyzer) and Frame

(AssignmentTrackingFrame) in order to establish relationships between

assignments of holders through chains of local variables. This effectively

back-propagates non-replaceability attributes so that if a holder variable

that can't be replaced is assigned to from another holder variable, that

second one cannot be replaced either, and so on through longer chains of

assignments.

- code for dumping generated source code

- MergeAdapter dumps before and after results of scalar replacement

(if it's on)

- fixed some problems in ReplacingBasicValue by replacing HashSet with

IdentityHashMap

- made loggers private

- added a retry strategy for scalar replacement

if a scalar replacement code rewriting fails, then this will try to

regenerate the bytecode again without the scalar replacement.

- bytecode verification is always on now (required for the retry strategy)

- use system option to determine whether scalar replacement should be used

- default option: if scalar replacement fails, retry without it

- force replacement on or off

- unit tests for the retry strategy are based on a single known failure case,

covered by DRILL-2326.

- add tests TestConvertFunctions to test the three scalar replacement options

for the failing test case (testVarCharReturnTripConvertLogical)

- made it possible to set a SYSTEM option as a java property in Drillbit

- added a command line argument to force scalar replacement to be on during

testing in the rootmost pom.xml

In the course of this, added increased checking of intermediate stages of code

rewriting, as well as logging of classes that cause failures.

- work around a bug in ASM's CheckClassAdapter that doesn't allow for checking

of inner classes

Added comments, tidied up formatting, and added "final" in a number of places.

Signed-off-by: vkorukanti <venki.korukanti@gmail.com>

  1. … 53 more files in changeset.
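The retry strategy for scalar replacement described in DRILL-1385 (try with replacement, fall back without it, or force it on or off) can be sketched as below. This is an illustration under assumptions, not Drill's implementation: `ScalarReplacementRetry`, the `Mode` constants, and the `generator` callback are hypothetical, and failure is modelled as a `RuntimeException`.

```java
import java.util.function.Function;

// Hypothetical sketch of the three options described in the commit message.
final class ScalarReplacementRetry {
    enum Mode { RETRY_ON_FAILURE, FORCE_ON, FORCE_OFF }

    // generator.apply(true) produces bytecode with scalar replacement,
    // generator.apply(false) produces it without.
    static byte[] compile(Mode mode, Function<Boolean, byte[]> generator) {
        switch (mode) {
            case FORCE_ON:
                return generator.apply(true);
            case FORCE_OFF:
                return generator.apply(false);
            default:
                try {
                    return generator.apply(true);
                } catch (RuntimeException e) {
                    // Rewriting or verification failed: regenerate without replacement.
                    return generator.apply(false);
                }
        }
    }

    public static void main(String[] args) {
        Function<Boolean, byte[]> failing = replace -> {
            if (replace) {
                throw new IllegalStateException("verification failed");
            }
            return new byte[] {0};
        };
        byte[] result = compile(Mode.RETRY_ON_FAILURE, failing);
        System.out.println("fell back, bytes = " + result.length);
    }
}
```

The fallback only works if a failed rewrite is detected at all, which is consistent with the commit's note that bytecode verification is always on.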
DRILL-1904 - Part 1: Package level docs for the common module and a few packages in exec.

Fix line wrapping on package level docs to standardize at 80 character wrap.

Patch updated to address review comments.

Patch updated again to address Sudheesh's review comments.

  1. … 28 more files in changeset.
DRILL-1402: Add check-style rules for trailing space, TABs and blocks without braces

  1. … 439 more files in changeset.
DRILL-634: Cleanup/organize Java imports and trailing whitespaces from Drill code

    • -12
    • +12
    ./ConstantExpressionIdentifier.java
  1. … 766 more files in changeset.
Switch to DrillBuf

Add @Inject DrillBuf

Move comparison functions to memory sensitive ones

Add scalar replacement functionality for value holders

Simplify date parsing function

Add local compiled code caching

  1. … 210 more files in changeset.
DRILL-1044: Optimize boolean and/or operators by short-circuit and fast-success / fast-fail approach.

  1. … 40 more files in changeset.
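The short-circuit approach named in DRILL-1044 stops evaluating operands as soon as the result is known. A minimal sketch, assuming plain two-valued booleans and ignoring SQL null semantics; `ShortCircuit`, `and`, and `or` are hypothetical names, not Drill's generated code:

```java
import java.util.function.BooleanSupplier;

// Hypothetical sketch; operands are suppliers so that evaluation can be skipped.
final class ShortCircuit {
    // Fast-fail: the first false operand decides the result.
    static boolean and(BooleanSupplier... operands) {
        for (BooleanSupplier operand : operands) {
            if (!operand.getAsBoolean()) {
                return false;
            }
        }
        return true;
    }

    // Fast-success: the first true operand decides the result.
    static boolean or(BooleanSupplier... operands) {
        for (BooleanSupplier operand : operands) {
            if (operand.getAsBoolean()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        int[] evaluated = {0};
        boolean result = and(
            () -> { evaluated[0]++; return false; },
            () -> { evaluated[0]++; return true; });
        System.out.println(result + ", operands evaluated: " + evaluated[0]);
    }
}
```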
DRILL-968: Use checkstyle plugin to prevent inadvertent use of shaded Guava classes

+ Disallow non-static '*' imports in handwritten code.

+ Updated the current code to be in compliance.

+ Run 'rat' plugin in 'validate' phase.

  1. … 103 more files in changeset.
Remove references to jcommander's copy of Guava's Lists class.

  1. … 32 more files in changeset.
DRILL-565: Remove spurious insertion into IdentityHashMap in ConstantExpressionIdentifier.visitFloatConstant()

DRILL-332: Support for decimal data type

    • -0
    • +24
    ./ConstantExpressionIdentifier.java
  1. … 62 more files in changeset.
Build the IfExpression correctly in ExpressionStringBuilder

Use IntExpression instead of LongExpression for integers

  1. … 4 more files in changeset.
DRILL-356: Changes to support date type

    • -0
    • +30
    ./ConstantExpressionIdentifier.java
  1. … 54 more files in changeset.
DRILL-452: Conversion functions for external data types

  1. … 57 more files in changeset.
DRILL-429: Remove extraneous casts.

Change cast from a function call to a new, separate type of expression.

Add test to confirm functionality.

  1. … 13 more files in changeset.
DRILL-335: Implement Hash Aggregation

1. Implementation of the hash aggregation execution operator - this has two main parts: the HashAggTemplate and the HashAggBatch.

2. Implementation of a hash table which is used by the hash aggregation. The hash table has two main parts: the HashTableTemplate and the ChainedHashTable. The hash table internally uses the notion of 'BatchHolder' to keep track of all keys that can fit within one batch of 64K values. New BatchHolder objects are created as needed. Each BatchHolder has its own vector container. The HashAggregate also has a similar structure and it keeps track of the workspace variables.

(NOTE: An initial design document for the hash aggregation and hash table was already attached with Drill-335. The document has not yet been updated with the latest implementation ... will try to do that in the near future).

3. Jinfeng's changes to use workspace vectors in the generated code for aggregate functions (previously, for streaming aggregate we only needed to maintain workspace variable for 1 running group; however for hash aggregate we need to maintain it for all groups).

4. Fix for Drill-318: because of #3 above, the previous fix for Drill-318 is not valid anymore. I modified the template generation code for the aggregate functions such that they conform to the new infrastructure.

5. The original AggTemplate, AggBatch and Aggregator classes have been moved to the corresponding StreamingAggTemplate, StreamingAggBatch and StreamingAggregator in order to differentiate them from hash aggregation. These appear as new files but the code there has not changed.

I have run several tests manually as part of TestHashAggr...these tests use TPC-H data and in particular a relatively large 'Orders' table. However, I have not yet packaged the tests to run as part of JUnit since the location and size of the parquet files needs to be figured out. I will continue to work on that.

  1. … 47 more files in changeset.
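The 'BatchHolder' notion from DRILL-335 (fixed batches of 64K values, with new holders created as needed) can be sketched as a two-level index. This is an illustration only: `BatchedStore` is a hypothetical class, it stores plain `int` keys instead of value vectors, and the shift/mask addressing is an assumption about how a global index might map to a batch and an offset.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of batch-based storage; not Drill's HashTableTemplate.
final class BatchedStore {
    static final int BATCH_SIZE = 1 << 16; // 64K values per batch

    private final List<int[]> batches = new ArrayList<>();
    private int size;

    // Appends a key, allocating a new batch when the current one is full.
    int append(int key) {
        int batchIndex = size >>> 16;
        int offset = size & 0xFFFF;
        if (batchIndex == batches.size()) {
            batches.add(new int[BATCH_SIZE]);
        }
        batches.get(batchIndex)[offset] = key;
        return size++;
    }

    // Resolves a global index to (batch, offset) and returns the stored key.
    int get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return batches.get(index >>> 16)[index & 0xFFFF];
    }

    int batchCount() {
        return batches.size();
    }

    public static void main(String[] args) {
        BatchedStore store = new BatchedStore();
        for (int i = 0; i <= BATCH_SIZE; i++) {
            store.append(i * 2);
        }
        System.out.println("batches: " + store.batchCount());
    }
}
```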
DRILL-387: Support using Hive simple UDFs in Drill

* As part of this change FunctionDefinition (and related code) are deleted, instead the same information available in Function Holders are used whenever required

* Freemarker/CodeModel codegen for Drill ObjectInspectors

* Comparator function cleanup

  1. … 85 more files in changeset.
DRILL-346: Move constant expressions to setup.

  1. … 16 more files in changeset.
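Moving constant expressions to setup, as in DRILL-346, first requires finding which subexpressions are constant. The sketch below shows one way to do that on a toy expression tree. It is not the code in ConstantExpressionIdentifier.java: `ConstantFinder`, `Expr`, and the rule "a call is constant when all of its arguments are constant" are assumptions for illustration, and the rule presumes deterministic functions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; not Drill's ConstantExpressionIdentifier.
final class ConstantFinder {
    static final class Expr {
        final String label;
        final boolean literal;
        final List<Expr> args;

        Expr(String label, boolean literal, Expr... args) {
            this.label = label;
            this.literal = literal;
            this.args = Arrays.asList(args);
        }

        static Expr literal(String label) { return new Expr(label, true); }
        static Expr column(String label) { return new Expr(label, false); }
        static Expr call(String label, Expr... args) { return new Expr(label, false, args); }
    }

    // Returns the maximal constant subtrees of root, i.e. candidates to evaluate once in setup.
    static List<Expr> find(Expr root) {
        List<Expr> found = new ArrayList<>();
        if (isConstant(root, found)) {
            found.clear();
            found.add(root);
        }
        return found;
    }

    // True if e is constant; records constant children of non-constant parents in found.
    private static boolean isConstant(Expr e, List<Expr> found) {
        if (e.args.isEmpty()) {
            return e.literal;
        }
        List<Expr> constantArgs = new ArrayList<>();
        boolean allConstant = true;
        for (Expr arg : e.args) {
            if (isConstant(arg, found)) {
                constantArgs.add(arg);
            } else {
                allConstant = false;
            }
        }
        if (!allConstant) {
            found.addAll(constantArgs);
        }
        return allConstant;
    }

    public static void main(String[] args) {
        // add(a, mul(2, 3)): only mul(2, 3) can be evaluated once up front.
        Expr tree = Expr.call("add", Expr.column("a"),
            Expr.call("mul", Expr.literal("2"), Expr.literal("3")));
        for (Expr e : find(tree)) {
            System.out.println("constant subtree: " + e.label);
        }
    }
}
```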
DRILL-366, DRILL-364: Small fixes due to merge issues

  1. … 5 more files in changeset.
DRILL-665: Handle null values in case expressions.

Signed-off-by: vkorukanti <venki.korukanti@gmail.com>

    • -0
    • +12
    ./ConstantExpressionIdentifier.java
  1. … 12 more files in changeset.
DRILL-363: Custom null handling in hash functions

Added custom null handling for hash functions, to override the previous NULL_IF_NULL handling. Added Integer and Float literals to the logical expression package and corresponding methods in the various visitor implementations. Also added new hash functions for double and float values, based on the doubleToLongBits and floatToIntBits methods in the corresponding boxed primitive classes, together with the murmur3_128 hash algorithm previously used for plain Ints and Longs.

Signed-off-by: Jacques Nadeau <jacques@apache.org>

    • -1
    • +13
    ./ConstantExpressionIdentifier.java
  1. … 13 more files in changeset.
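The double and float hashing described in DRILL-363 converts each value to its bit pattern before hashing. A minimal sketch follows, with loudly stated assumptions: `FloatingHash` and its methods are hypothetical, the mixer is the 64-bit finalizer from MurmurHash3 used as a stand-in rather than the full murmur3_128 algorithm, and returning 0 for a null input is an assumed form of the custom null handling.

```java
// Hypothetical sketch; not Drill's hash function implementations.
final class FloatingHash {
    // MurmurHash3 64-bit finalizer, used here as a stand-in for murmur3_128.
    static long mix64(long k) {
        k ^= k >>> 33;
        k *= 0xff51afd7ed558ccdL;
        k ^= k >>> 33;
        k *= 0xc4ceb9fe1a85ec53L;
        k ^= k >>> 33;
        return k;
    }

    // doubleToLongBits collapses every NaN to one canonical bit pattern.
    static int hashDouble(double value) {
        return (int) mix64(Double.doubleToLongBits(value));
    }

    // floatToIntBits does the same for float NaNs.
    static int hashFloat(float value) {
        return (int) mix64(Float.floatToIntBits(value));
    }

    // Assumed null handling: a null input hashes to 0 instead of producing a null result.
    static int hashNullableDouble(Double value) {
        return value == null ? 0 : hashDouble(value);
    }

    public static void main(String[] args) {
        double otherNaN = Double.longBitsToDouble(0x7ff8000000000001L);
        System.out.println(hashDouble(Double.NaN) == hashDouble(otherNaN));
        System.out.println(hashNullableDouble(null));
    }
}
```

Hashing the bit pattern instead of the numeric value means all NaN inputs hash identically, while 0.0 and -0.0 are hashed from different bit patterns.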