initial support for returning multiple versions and column timestamps

This feature is not yet externalized.

Support added to:

-- return multiple versions of rows

-- select * from table {versions N | MAX | ALL}

-- get hbase timestamp of a column

-- select hbase_timestamp(col) from t;

-- get version number of a column.

-- select hbase_version(col) from t

Change-Id: I37921681fc606a22c19d2c0cb87a35dee5491e1e

Move core into subdir to combine repos

Use: git log --follow -- <file>

to view file history through renames.

Fixes for a few scalar UDF bugs

LP 1426605: change in NormRelExpr.cpp. When left linearizing a join backbone,

sufficient inputs were not being provided. The change ensures that inputs

from the old tree and still marked as required inputs for a node in the new


LP 1420530: Error handling added to BiArith::bindNode.

LP 1420938: Error handling added to CREATE FUNCTION statement to flag more than 32


LP 1421438: showddl [function | procedure | table_mapping function] <name>;

now works. If one of the optional tokens is not specified then we will look

for a table called <name>.
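
The lookup rule for LP 1421438 can be illustrated with a small sketch. This is not the Trafodion parser; `parse_showddl` is a hypothetical helper that only shows how a missing optional token falls back to a table lookup:

```python
def parse_showddl(statement):
    """Return (object_kind, name) for a showddl statement.

    Hypothetical helper for illustration only. When none of the optional
    tokens is present, the name is treated as a table name.
    """
    tokens = statement.strip().rstrip(";").split()
    if not tokens or tokens[0].lower() != "showddl":
        raise ValueError("not a showddl statement")

    rest = tokens[1:]
    lowered = [t.lower() for t in rest]

    # Check the two-token form first so "table_mapping function" is not
    # mistaken for a table named "table_mapping".
    if lowered[:2] == ["table_mapping", "function"]:
        kind, name_tokens = "table_mapping function", rest[2:]
    elif lowered[:1] == ["function"]:
        kind, name_tokens = "function", rest[1:]
    elif lowered[:1] == ["procedure"]:
        kind, name_tokens = "procedure", rest[1:]
    else:
        # No optional token: look for a table called <name>.
        kind, name_tokens = "table", rest

    if len(name_tokens) != 1:
        raise ValueError("expected exactly one object name")
    return kind, name_tokens[0]
```

For example, `parse_showddl("showddl t1;")` yields `("table", "t1")`, while `parse_showddl("showddl function myudf;")` yields `("function", "myudf")`.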

Patch Set 1

Changes to address comments by Dave.

One more fix in ExUdr.cpp. There is no LP for this bug. A missing DLL

at runtime, or other LOAD errors during UDF fixup, could lead to an assertion,

since we tried to place an error in the UDF's up queue before there were entries

in the corresponding down queue. The fix is to remove this line and let existing

error handling report this error. Thanks for your help, Hans.

A couple of items that I forgot to mention before:

1) Changes in Analyzer.cpp related to printing predecessorJBBC are due to Hans.

2) Showddl code is mostly refactored from previous versions.

Change-Id: Idfde89d73c47735c4405befa6b9cdd4ae0d2e641

Manageability changes - event mgmt and stats publication

Implements changes to support event management using log4cpp.

Configuration files are located in the $MY_SQROOT/conf folder and all log

files are located in the $MY_SQROOT/logs folder.

For more information see the blueprint at:

Implements changes for publication of statistics to repository. For more

information see the blueprint at:


In this initial delivery, publication of statistics is disabled by

default and can be enabled via a DCS property. This code has been

reviewed internally prior to merging with mainline.


Included timestamp to be part of the primary key for metric aggregation


Addressed some of the comments and incorporated Anoop's change for


Changed the queryBuf size in sql/sqlcomp/CmpSeabaseDDLrepos.cpp to 20000

Modified the sql/regress/seabase/EXPECTED024

Change-Id: I517575233c10b2a8683cdd1d53a2eec96d7c2a6f

Support for divisioning (multi-temperature data)

This is the initial support for divisioning. See

blueprint cmp-divisioning for more information:

Also, this change fixes the following LaunchPad bugs:

Bug 1388458 insert using primary key default value into a salted

table asserts in generator

Bug 1385543 salt clause on a table with large number of primary

key columns returns error

Bug 1392450 Internal error 2005 when querying a Hive table with

an unsupported data type

In addition, it changes the following behavior:

- The _SALT_ column now gets added as the last column in the

CREATE TABLE statement, rather than the first column after

SYSKEY. The position of _SALT_ in the clustering key does

not change. This will cause some differences in INVOKE and

in the column number assigned to columns.

- For CREATE TABLE LIKE, the defaults of the WITH clauses

are changing. CREATE TABLE LIKE now copies constraints,

SALT and DIVISION clauses by default. The WITH CONSTRAINTS

clause is now the default and should no longer be used.


DIVISIONING clauses are supported.

- For CREATE INDEX ... SALT LIKE TABLE, we now give a

warning instead of an error if the table is not salted.

- Also added an optimization for BETWEEN predicates. If

part or all of them can be converted to an equals predicate,

we do this now. Example:

(a,b,c,d) between (1,2,3,4) and (1,2,5,6)

is transformed into

a=1 and b=2 and (c,d) between (3,4) and (5,6).
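
The BETWEEN rewrite above can be sketched in a few lines. This is only an illustration of the idea on plain lists; the actual change works on item expression trees in optimizer/NormItemExpr.cpp, and the names `split_between` and `render` are hypothetical:

```python
def split_between(columns, low, high):
    """Split a row-value BETWEEN into equality predicates plus a shorter BETWEEN.

    Leading positions where the low and high bounds agree can only be
    satisfied by equality, so they are peeled off as "col = value".
    """
    if not (len(columns) == len(low) == len(high)):
        raise ValueError("columns, low and high must have the same length")

    # Count the leading positions where both bounds agree.
    prefix = 0
    while prefix < len(columns) and low[prefix] == high[prefix]:
        prefix += 1

    equalities = [(columns[i], low[i]) for i in range(prefix)]
    rest = None
    if prefix < len(columns):
        rest = (columns[prefix:], low[prefix:], high[prefix:])
    return equalities, rest


def render(equalities, rest):
    """Format the split predicate as text, in the style used in this log."""
    def fmt(items):
        return "(" + ",".join(str(x) for x in items) + ")"

    parts = [f"{col}={val}" for col, val in equalities]
    if rest is not None:
        cols, lo, hi = rest
        parts.append(f"{fmt(cols)} between {fmt(lo)} and {fmt(hi)}")
    return " and ".join(parts)
```

Calling `render(*split_between(["a", "b", "c", "d"], [1, 2, 3, 4], [1, 2, 5, 6]))` reproduces the transformed predicate from the example. If all bounds agree, `rest` is `None` and only equality predicates remain.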

More detailed description of changes:

- arkcmp/CmpStoredProc.cpp


+ other files

Using the new FLAGS column in the COLUMNS metadata table to store

whether a column is a salt or divisioning column. Note that since

there may be existing salted tables without this flag set, the flag

is so far only reliable for divisioning columns.

- comexe/ComTdb.h




Changed the column class field in struct

ComTdbVirtTableColumnInfo from a string to the corresponding

enum. Sorry, this caused lots of small changes (deleting "_LIT"

from the initializers). Also added the column flags.

- executor/hiveHook.cpp: Added a check for partitioned tables

(having multiple SDs). This is part of the fix for

bug 1353632.

- GenRelUpdate.cpp: When generating the key encoding expression

for an insert inside a MERGE operation, we assumed the new

record expression was in the order of the key columns. Added

a step to sort by key column, so we can pass the expression

in any order.

- optimizer/ItemExpr.cpp


Added a named NATypeToItem item expression.

This is used to do a primitive "bind" operation of an item expression

when processing a DDL statement. Specifically, to bind the DIVISION BY

clause in a CREATE TABLE statement.

- optimizer/ItemFunc.h

optimizer/SynthType.cpp: The DDL time "binder" gets expressions as

they come out of the parser, e.g. a ZZZBinderFunction. Need to add

type synthesis for some cases of the ZZZBinderFunction.

- optimizer/NATable.cpp

Removing some dead code. Adding an error message when we encounter

a Hive column type we can't handle yet. Bug 1392450.

- optimizer/TableDesc.*

Method TableDesc::validateDivisionByClauseForDDL() got moved

to CmpSeabaseDDL::validateDivisionByExprForDDL().

- optimizer/NormItemExpr.cpp

BETWEEN transformation described above.

- optimizer/ValueDesc.cpp

Avoid hard-coding the "_SALT_" name and add a comment about the

possibility of using the flag in the future.

- parser

Lots of small changes for salt and divisioning option changes.

Simplifying the syntax for salt options somewhat. I think the older

syntax was so complex because it needed to record the starting and

ending position of the divisioning clause, something we don't need


- regress: Adding new test

- sqlcomp/CmpDescribe.cpp: Support for describing DIVISION BY clause

and also supporting the new WITHOUT SALT | DIVISION options

for CREATE TABLE LIKE, which relies on the describe feature.

- sqlcomp/CmpSeabaseDDLcommon.cpp


+ Handling the new column flags and making sure they are not

confused with the HBase column flags (e.g. for serialization).

+ Setting the new COLUMNS.FLAGS when writing metadata.

+ Also, writing the computed column text to the TEXT table.

+ For DROP TABLE, unconditionally deleting TEXT rows, since the

table could contain computed columns.

+ When building ColInfoArray, check system column flags, since

system columns can now appear at any position.

+ Add method to "bind" an item expression during DDL processing

without going through the full binder. This replaces any column

reference with a named NATypeToItem node, since all we really

need is the type and the name for unparsing.

+ Method TableDesc::validateDivisionByClauseForDDL() got moved

to CmpSeabaseDDL::validateDivisionByExprForDDL() with some minor

adjustments, since it used to be called on a bound ItemExpr, now

it gets called on something that came out of the parser and went

through the DDL time "binder".

- sqlcomp/CmpSeabaseDDLindex.cpp:

Support for CREATE INDEX ... DIVISION LIKE TABLE. If this is

set, add the division columns in front of the index key, otherwise


- sqlcomp/CmpSeabaseDDLtable.cpp:

+ Code to make sure column flags and column class are set and propagated.

+ Fix for bug 1385543: Now that we use the TEXT table for computed

column text, we no longer have a length limit. This is true for both

divisioning and salt expressions.

+ When processing the column list in seabaseCreateTable() we have a

bit of a chicken and egg problem: We need the column list to validate

the DIVISION BY expressions, but the DIVISION BY columns need to be part

of the column list. So, we do this a first time without divisioning

columns, then we add those, and produce the final list in a second


+ getTextFromMD method now takes a sub-id as an input parameter. That's

the column number for computed column text.

+ read computed column text from the TEXT table. Note: This also needs

to handle older tables where the computed column text is stored in

the default value.
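
The two-pass handling of the column list described for seabaseCreateTable() can be sketched as follows. This is a simplified illustration, not the code in sqlcomp/CmpSeabaseDDLtable.cpp. The function name, the `_DIVISION_n_` column names, and the placement of the divisioning columns at the end of the list are all assumptions made for the example:

```python
def build_column_list(user_columns, division_exprs):
    """Return the final column list for a table with DIVISION BY expressions.

    division_exprs is a list of (expression_text, referenced_column_names).
    Hypothetical sketch: names and column placement are assumptions.
    """
    # Pass 1: the column list without divisioning columns. The DIVISION BY
    # expressions are validated against this list.
    first_pass = list(user_columns)
    known = set(first_pass)
    for expr_text, referenced in division_exprs:
        missing = [name for name in referenced if name not in known]
        if missing:
            raise ValueError(
                f"DIVISION BY expression {expr_text!r} references "
                f"unknown column(s): {missing}"
            )

    # Pass 2: add one computed column per validated expression and produce
    # the final list.
    division_columns = [
        f"_DIVISION_{i}_" for i in range(1, len(division_exprs) + 1)
    ]
    return first_pass + division_columns
```

The first pass breaks the circular dependency: validation needs a column list, and the final column list needs the validated expressions.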

Change-Id: I7c3ebe39a950c1d01f31855bdc92cbb98e5eb275

fix cardinality errors for better DoP for OE

Problems fixed are as follows.

1. TableAnalysis::getLocalPredsOnPrefixOfList() prematurely breaks

out of the loop to collect predicates on key columns.

2. TableDesc::getSaltColumnAsSet() does not return the result in

the form of base column valueIds.

3. When the DoP is estimated lower than the low bound of ESPs, the

low bound value is used.

4. SimpleFileScanOptimizer::computeSingleSubsetSize() incorrectly

applies the computed predicates on the salted column, which reduces the

subset size by CQD HIST_NO_STATS_UEC (defaults to 2). A subset size

specifies the number of rows processed.

Change-Id: I85bebf0d31d8f3e2db5bfe7c00cb4467b53308ac

Index-join scan trimming heuristics rework II

Change-Id: I9cf63be7967a012559e27dc4ca950bb28b8ccd3b

DoP adjustment for small queries. Rework

Change-Id: Idd70130ca07dc4c82eb2a83422093789fd510c81

Squashed commit of the following:

commit 221d4199001b3f06d5629b82ed1281bdfb95f043

Author: qchen <>

Date: Mon Sep 8 19:32:05 2014 +0000

DoP: use a better version of resource estimator

Change-Id: Idf6ea5caa7e4915c65b4d54f58d7e483c26871f4

commit 75c92f255ad3841102757f6188d749db42eaedf3

Author: qchen <>

Date: Fri Sep 5 15:55:11 2014 +0000

do not add ESP partition requirement to partial GB leaf1 and leaf2.

commit f7c652adf2953d18f10da4a952f78d3815b5b3d4

Author: qchen <>

Date: Fri Sep 5 15:20:40 2014 +0000


commit a419954b1528ca0d1f84c5298b806b46f561a713

Author: qchen <>

Date: Wed Sep 3 23:45:13 2014 +0000

add a check on stats smart ptr before its use.

commit 8a2dcc24f0081d554ec3d69560c050f720da04dd

Author: qchen <>

Date: Wed Sep 3 18:31:50 2014 +0000

partial gb root running in Master; use rowcount from stats for Hbase

in AppliedStatMan::getStatsForCANodeId().

commit 55f46ededa9fcb30d38d17be6692e1535b4a274c

Author: qchen <>

Date: Fri Aug 29 17:46:00 2014 +0000

improve the logic to disable parallel GB partial root when rows < 5000

commit 743783720d030f38e50a065737c8c218fa47c6b5

Author: Ravisha Neelakanthappa <>

Date: Wed Aug 20 17:20:09 2014 +0000

Fix for bug 1348317. Enable Adaptive Segmentation logic to

compute DoP based on resource estimation.

Change-Id: I7c2b0756854d5a86d8bde162dad064b4e924233c

Change-Id: Id68e1d8265e478f128ed61d1cabc4849873dae23

Code cleanup (06042014)

Change-Id: Ib7f2f39e42085e026ac1000e43048c9a185b6976

Removed obsolete makefiles and makefile parts.

Added copyright date in sql/optimizer/Analyzer.cpp

Removed this unnecessary output when SQ_VERBOSE is 1:


* 64-bit build selected *


Change-Id: I25af40141f20aac5e21096811555ab8993b6b5f0

Initial code drop of Trafodion

