Fix for Hive regression failure

The interface change to pass in lobMaxSize, which is relevant only for

lob columns, affected a common interface used by the Hive interface.

Fixed it to pass 0 for lobMaxSize into that interface when called for

accessing Hive tables. All Hive tests pass.

Change-Id: Ibfbf04057ac09d60960be20f36ff5f583fe783fd

initial support for returning multiple versions and column timestamps

This feature is not yet externalized.

Support added to:

-- return multiple versions of rows

-- select * from table {versions N | MAX | ALL}

-- get hbase timestamp of a column

-- select hbase_timestamp(col) from t;

-- get version number of a column.

-- select hbase_version(col) from t;

Change-Id: I37921681fc606a22c19d2c0cb87a35dee5491e1e

  1. … 41 more files in changeset.
Fixes from review to sqvers

commit 04f3812f112a5629a563f02d7e72c5fa503c6a8d

Author: Sandhya Sundaresan <>

Date: Sun Jun 14 04:23:21 2015 +0000

Preliminary checkin of lob support for external files. Inserts from http

files, hdfs files and lob local files are supported. Added support for

new extract syntax. Extract from lob

columns to hdfs files has been added. More work needed to support

binary files and very large files. Current limit is 1G.

Also fixed some error handling issues

Fixed some substring warning issues in the lobtostring/stringtolob


Added references and interfaces to curl library that is needed to read external http


More work needed before this support can be used

Change-Id: Ieacaa3e4b7fa2a040764888c90ef0b029f107a8b

Change-Id: Ife3caf13041f8106a999d06808b69e5b6a348a6b

  1. … 22 more files in changeset.
Migrate from log4cpp to log4cxx

This change is a wholesale removal of log4cpp from source tree.

log4cxx is an external library installed via RPM, or built by the user,

into the default /usr/lib64 and /usr/include directories. Some of the

QRLogger and CommonLogger code was changed to use the new log4cxx


Change-Id: I248bac0a8ffbfea6cbc1ba847867b30638892eae

  1. … 208 more files in changeset.
Code change for ESP colocation strategy described in LP 1464306

Change-Id: I838bc57541bf753e3bad90ae7776e870958c930a

  1. … 13 more files in changeset.
Batch inserts for varchar columns > 32k failing with exception

The message id: batch_command_failed

at org.trafodion.jdbc.t4.TrafT4PreparedStatement.executeBatch


at TestBigCol.JDBCBigColBatch1(

Temporary fix - will be changed to use vcindlen later.

Changes from the following commit needed, along with this mxosrvr


"-- changed sizeof(short) to correct vcindlen (2 or 4 bytes)":

Handled one of the review comments in 1714.

Change-Id: I52e6d9380fa569b50a5f97c71f3cbef524a165c9

  1. … 1 more file in changeset.
various fixes and enhancements, details below.

-- improved DDL performance by not invalidating internal create/alter


-- added an optimization during CREATE INDEX to not go through

'upsert using load' processing if source table is empty.

-- added support for ISO datetime format (2015-06-01T07:35:20Z)

-- added support for RESET option to ALTER SEQUENCE and IDENTITY.

This will reset generated seq num to the START VALUE.

-- added support for cqd TRAF_STRING_AUTO_TRUNCATE.

If set, strings will be automatically truncated during insert/update.

-- fixed sqlci to pass in correct varchar param len indicator (2 or 4 bytes).

-- changed sizeof(short) to correct vcindlen (2 or 4 bytes)

-- removed some NA_SHADOWCALLS defines

Change-Id: Ie6715435d9c210ae6c2db4ff6bc0545c1b196979

  1. … 36 more files in changeset.

blueprint alter-table-hbase-options

This set of changes includes non-transactional HBase semantics


change, two sets of logic that manipulate HBase options in

the CREATE TABLE code path were refactored so that logic

could be reused by ALTER TABLE/INDEX. In both these cases,

the logic in question was taken out of a create-side method

and moved into a separate method.

Still to come is the equivalent transactional support, and

also a regression test script that exercises this feature.


but only in a non-transactional manner. That is, if the

operation is aborted after the request has been issued to HBase,

the change to the Trafodion metadata may be rolled back but

the change to the underlying HBase object may still complete.

Change-Id: Ied11d0b2ac7b7d65110d39a1e5a7fbbac1e1f827

  1. … 7 more files in changeset.
Move core into subdir to combine repos

  1. … 10754 more files in changeset.
Move core into subdir to combine repos

  1. … 10608 more files in changeset.
Move core into subdir to combine repos

Use: git log --follow -- <file>

to view file history thru renames.

  1. … 10823 more files in changeset.
Fix bug 1323826 - SELECT with long IN predicate causes core file

Actually, this check-in does not completely fix the problem, but

it does allow IN predicates (and NOT IN predicates) to have a list

with as many as 3100 items in the list.

NOTE: There are many places in the SQL Compiler code that use recursion.

The changes in this check-in address the issue for long IN lists

and, to some extent, for INSERT statements that attempt to insert

many rows with a single INSERT statement. However, it is still possible

for someone to try a list that is too long. As you make the lists

longer, you find more recursive routines that have the same type of

problem(s) that are being fixed for certain routines by this check-in.

This check-in also fixes a couple of minor problems in the logic used to

debug Native Expressions code. These problems were in

.../sql/generator/Generator.cpp and


There were 3 different techniques used to reduce the stack space usage of

various recursive routines that get invoked as a result of long IN lists

or NOT IN lists:

1) Move variables from the stack to heap.

2) Recode the recursive routine to pull out sections of code (not needed

during the recursion) and put those in their own routine. This cuts

the stack space usage because it enables the C++ compiler to generate

code for the recursive routine that needs significantly less stack


3) Declare variables of type ARRAY on the stack (where the ARRAY

overhead is allocated from stack, but the contents come from heap)

to hold certain pieces of data where each recursive level of calling

needs its own value for the variable AND then change the code to use a

'while' loop to process the nodes in the node tree in the same order

that the original recursive routine would have processed the nodes.
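Technique 3 above can be sketched as follows. The Node type here is a hypothetical stand-in for ItemExpr (the real traversal is far richer): pending nodes live in a heap-backed container and a 'while' loop replaces the recursion, so the deep, skewed tree produced by a long IN list no longer consumes one call-stack frame per node.

```cpp
#include <vector>

// Hypothetical minimal node type standing in for ItemExpr; the real
// class lives in sql/optimizer and carries much more state.
struct Node {
    int value;
    Node* left = nullptr;
    Node* right = nullptr;
};

// Iterative walk: the explicit stack's contents come from heap, so only
// a constant amount of call stack is used regardless of tree depth.
int sumTreeIterative(Node* root) {
    int total = 0;
    std::vector<Node*> pending;
    if (root) pending.push_back(root);
    while (!pending.empty()) {
        Node* n = pending.back();
        pending.pop_back();
        total += n->value;
        if (n->left)  pending.push_back(n->left);
        if (n->right) pending.push_back(n->right);
    }
    return total;
}
```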

Files changed for reducing stack space usage:

sql/optimizer/ItemCache.cpp - use method 2 on ItemExpr::generateCacheKey()

sql/optimizer/NormItemExpr.cpp - use method 2 on ItemExpr::normalizeNode()

and method 1 on BiLogic::predicateEliminatesNullAugmentedRows()

sql/generator/GenPreCode.cpp - use method 2 on


sql/optimizer/ItemExpr.cpp - use method 2 on ItemExpr::unparsed()

and ItemExpr::synthTypeAndValueId()

sql/optimizer/OptRange.cpp - use method 3 on OptRangeSpec::buildRange()

sql/optimizer/BindItemExpr.cpp - use method 3 on


sql/optimizer/NormRelExpr.cpp - use method 3 on


sql/optimizer/ItemExpr.h - declare new methods that were created

sql/optimizer/ItemLog.h - declare new methods that were created

Finally, this check-in changes the default value for a CQD named

PCODE_MAX_OPT_BRANCH_CNT from 19000 to 12000. This was to fix a problem

where we used too much *heap* space when we tried to optimize a PCODE

Expression that had too many separate blocks of PCODE instructions (such

as results from a very long NOT IN list.) With this change, we will

choose to run with unoptimized PCODE if trying to optimize the PCODE

would result in overflowing the heap space available.

Change-Id: Ie8ddbab07de2a40095a80adac7873db8c5cb74ac

  1. … 11 more files in changeset.
Configuring hbase option MAX_VERSION via SQL

Change-Id: I88041d539b24de1289c15654151f5320b67eb289

  1. … 10 more files in changeset.
Merge "Enabling Bulk load and Hive Scan error logging/skip feature"

  1. … 6 more files in changeset.

blueprint alter-table-hbase-options

This set of changes includes the following:


2. Compiler binding and generation logic

3. Run-time logic to update the metadata TEXT table with the

new options.

Still to come is:

1. Logic to actually change the HBase object

2. Transactional logic to actually change the HBase object

The functionality in this set of changes is only marginally

useful. If you manually change hbase options on a Trafodion

object using the hbase shell, you could use this ALTER

TABLE/INDEX command to update the Trafodion metadata. (Of

course some care would have to be taken to use the same


Change-Id: Id0a5513fe80853c06acdbbf6cc9fd50492fd07b2

  1. … 18 more files in changeset.
Enabling Bulk load and Hive Scan error logging/skip feature

Also fixed the hanging issue with Hive scan (ExHdfsScan operator) when there

is an error in data conversion.

ExHbaseAccessBulkLoadPrepSQTcb was not releasing all the resources when there

is an error or when the last buffer had some rows.

Error logging/skip feature can be enabled in

hive scan using CQDs and in bulk load using the command line options.

For Hive Scan


CQD TRAF_LOAD_LOG_ERROR_ROWS 'ON' to log the error rows in hdfs files.

For Bulk load

LOAD WITH CONTINUE ON ERROR [TO <location>] -- to skip error rows

LOAD WITH LOG ERROR ROWS -- to log the error rows in hdfs files.

The default parent error logging directory in hdfs is /bulkload/logs. The error

rows are logged in subdirectory ERR_<date>_<time>. A separate hdfs file is

created for every process/operator involved in the bulk load in this directory.

Error rows in hive scan are logged in


Error rows in bulk upsert are logged in


Bulk load can also be aborted after a certain number of error rows are seen using


Change-Id: Ief44ebb9ff74b0cef2587705158094165fca07d3

  1. … 30 more files in changeset.
Using the language manager for UDF compiler interface

blueprint cmp-tmudf-compile-time-interface

This change includes new CLI calls, to be used in the compiler to

invoke routines. Right now, only trusted routines are supported,

executed in the same process as the caller, but in the future we may

extend this to isolated routines. Using a CLI call allows us to share

the language manager between compiler and executor, since language

manager resources such as the JVM and loaded DLLs exist only once per

process. This change is in preparation for Java UDFs.

Changes in a bit more detail:

- Added 4 new CLI calls to allocate a routine, invoke it, retrieve

updated invocation and plan infos and deallocate (put) the routine.

The CLI globals now have a C/C++ and a Java language manager that

is allocated on demand.

- The compiler no longer loads a DLL for the UDF compiler interface,

it uses the new CLI calls instead.

- DDL syntax is changed to allow TMUDFs in Java (not officially

supported, so don't use it quite yet).

- TMUDFs in C are no longer supported, only C++ and Java are.

Converted remaining TMUDF tests to C++.

- C++ TMUDFs now do a basic verification at DDL time, so errors

like missing entry points are detected earlier. Validation for

Java TMUDFs is also done through the CLI.

- Make sure we have no memory or resource leaks:

- CmpContext keeps track of UDF-related objects allocated on

system heap and in the CLI, cleaned up at the end of a statement

- CLI keeps a list of allocated trusted routines, cleaned up

when a CLI context is deallocated

- Using ExeCliInterface class to make the new CLI calls (4 new calls


- Removed CmpCli class in the optimizer directory and converted

tracking compiler to use ExeCliInterface as well.

- Compile-time parameter values are no longer baked into the

UDRInvocationInfo. Instead, they are provided as an input row, the

same way as they are provided at runtime.

- Bug fixes in C++ UDR code, mostly related to serialization and

to multiple interactions with the UDF through serialized objects.

- Added more info to UDRInvocationInfo (SQL access type, etc.).

- Since there are multiple plans per invocation, each of which

can have multiple interactions with the UDF, plans need to be

numbered so the UDF side can tell them apart to attach the

right state (owned by the UDF) to it.

- The language manager needs some functions that are provided by

the process it's running in. Added those (empty, for now) functions

as cli/CliImplLmExtFunc.cpp.

- Added a new class for Java TMUDFs, LmRoutineJavaObj. Added methods

to allocate such routines and to load their class as well as to

create Java objects by invoking the default constructor through JNI.

- Java TMUDFs use the new UDR interface (to be provided by Suresh and

Pavani). In the language manager, the container is the class of

the UDF, the external path is the fully qualified jar name. The

Java method name is <init>, the default constructor, with signature

"()V". Some code changes were required to do this.

- Created a new directory trafodion/core/sql/src for Java sources in

the sql engine. Right now, only language manager java

sources are in this directory, but I am planning to move the other

java sources under sql in a future checkin. Suresh and Pavani

will add their UDF-related Java files there as well.

- Renamed the udr jar to trafodion-sql-<version>.jar, in anticipation

of combining all the sql Java sources into this jar.

- Created a maven project file trafodion/core/sql/pom.xml and

changed makefiles to invoke maven to build java sources.

- More work to separate new UDR interface from older SPInfo object,

so that we can get rid of SPInfo if/when we don't support the older

style anymore.

- Small fix to odb makefile, make clean failed when executed twice.

Patch set 2: Adding a custom filter for test regress/udr/TEST108.

Change-Id: Ic827a42ac25505fb1ee451b79636c0f9349d8841

  1. … 98 more files in changeset.
DDL Tx, Changes to handle upsert using load operations on abort

SQL statements 'upsert using load' will be executed outside of a

transaction even though the operation can be performed inside a

transaction. For this reason, it is necessary for us to handle this

data differently when a transaction aborts or fails.

These operations will be registered as part of the TmDDL manager

and if the transaction is aborted, that table gets truncated. The data

loaded will be deleted but table will still exist. At the moment, this is

handled by using disable, delete and create hbase methods. However, when

hbase is upgraded to the next version, we will use the truncate hbase

method option.

Change-Id: Ica9d81a284b5f1821a3048b9c8deaad449a4c4f4

  1. … 16 more files in changeset.
Bug #1451618 MXOSRVR returns truncated string to clients

When a table has varchar column and the varchar column size

is defined bigger than 32K, MXOSRVR server may return truncated

string to client application in certain circumstances.

Fix in sql/cli/CliExpExchange.cpp to not clear trailing characters.

As per SQL team this code is not needed and is a left over.

Also fixed alignment issues as pointed out by SQL and Connectivity

team (related to change done for bug #1449343)

This alignment change in mxosrvr needs a corresponding change in

linux and windows odbc driver. Windows driver change will be

submitted as a separate checkin.

Patch1 - removed one blank line.

Change-Id: Iabde0d3a0ae2922e15dbdcd36d23ba367da967fb

  1. … 3 more files in changeset.
Get indexLevel and blockSize from HBase metadata to use in costing code.

Change-Id: I7b30364ec83a763d3391ddc39e12adec2ca1bd00

  1. … 8 more files in changeset.
Changes to enable Rowset select - Fix for bug 1423327

HBase always returns an empty result set when the row is not found. Trafodion

is changed to exploit this concept to project no data in a rowset select.

Now optimizer has been enabled to choose a plan involving Rowset Select

where ever possible. This can result in plan changes for the queries -

nested join plan instead of hash join,

vsbb delete instead of delete,

vsbb insert instead of regular insert.

A new CQD HBASE_ROWSET_VSBB_SIZE is now added to control the hbase rowset size.

The default value is 1000.

Change-Id: Id76c2e6abe01f2d1a7b6387f917825cac2004081

  1. … 18 more files in changeset.
Fix LP bug 1446402 - LIKE patterns longer than 127 chars don't work well

If a fixed part of a LIKE pattern is longer than 127 characters, then

you get "matches" on column values that should not match. An example of

such a pattern would be:


where the fixed part [the part between two % (or _) characters]

is 128 characters long.

The root cause of the problem was another place in PCODE logic

where a signed char was being used to hold a length value.

By using an unsigned char, we can go up to 255 chars in a fixed part

of a LIKE pattern. If a fixed part is longer than 255, the SQL Compiler

should not be attempting to use PCODE for the LIKE predicate so things

should be fine.
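A minimal sketch of the root cause, with illustrative names rather than the actual PCODE fields: a length in the range 128..255 stored in one byte reads back negative through a signed char, but correctly through an unsigned char.

```cpp
// Illustrative only: the same stored byte, read back two ways.
// A fixed-part length of 128..255 goes negative through the signed read,
// which is what broke LIKE matching; the unsigned read is the fix.
int lengthViaSignedChar(unsigned char stored) {
    return (signed char)stored;   // 128 reads back as -128
}

int lengthViaUnsignedChar(unsigned char stored) {
    return stored;                // 128 reads back as 128
}
```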

Change-Id: I2ff8e00dedeb3145602f57eed7418ea7b3c17a77

Add new tmlib callback to propagate startid.

Change-Id: I42dac4838228c14d445bed2692dceaae2586a36e

  1. … 22 more files in changeset.
Fix LP Bug 1382686 - LIKE predicate fails on some UTF8 columns

The description of this LP bug says that the LIKE predicate fails when

the pattern string starts with a '%' and when selecting from a view, but

not when selecting from the underlying table. However, in the supplied

reproducible test case, the view's column had a character set of UTF8

while the underlying table had a character set of UCS2.

As it turns out, the real problem is not related to selecting from a

view. The root cause is a bug in the PCODE implementation for the LIKE

predicate and this problem can occur any time the LIKE predicate is

applied to a column declared as VARCHAR(nnn) CHARACTER SET UTF8 where

128 <= nnn <= 255... and the problem may not be limited to situations

where the LIKE pattern starts with a '%'.

When nnn > 255, the PCODE implementation is not used, so the problem does

not occur then.

The root cause is a place in the code where the length of the column (in

characters, not bytes) is stored in a single byte and is retrieved as if

that byte contains a *signed* value. When 128 <= nnn <= 255, that

retrieval results in a negative value. The fix is to retrieve the value

as an *unsigned* value.

NOTE: This commit changes 2 lines of code. The second line changed is

the only one necessary to fix the problem. The first line is being

changed as well because it has a similar problem and would prevent the

LIKE predicate from working properly if the LIKE pattern had more than

127 pattern parts.

Change-Id: Ideb063cbd62b9155e9b1f579bcd0edb187e8a1c8

Fix for bug 1442932 and bug 1442966, encoding for varchar

Submitting this before finishing regressions on workstation, in the

interest of time.

Key encodings for VARCHAR values used to put a varchar length indicator

in front of the encoded value. The value was the max. length of the

varchar and the indicator was 2 or 4 bytes long, depending on the

length of the indicator in the source field. That length used to

depend only on the max number of bytes in the field, for >32767

bytes we would use a 4 byte VC length indicator.

Now, with the introduction of long rows, the varchar indicator length

for varchars in aligned rows is always 4 bytes, regardless of the

character length. This causes a problem for the key encoding.

We could have computed the encoded VC indicator length from the field

length. Anoop suggested a better solution, not to include the VC

indicator at all, since that is unnecessary. Note that for HBase row

keys stored on disk, we already remove the VC indicator by converting

such keys from varchar to fixed char. Therefore, the issue happens

only for encoding needed in a query, for example when sorting or in a

merge join or union.

Description of the fix:

1. Change CompEncode::synthType not to include the VC length

indicator in the encoded buffer. This change also includes

some minor code clean-up.

2. Change the assert in CompEncode::codeGen not to include the

VC indicator length anymore.

3. Changes in ex_function_encode::encodeKeyValue():

a) Read 2 and 4 byte VC length indicators for VARCHAR/NVARCHAR.

b) Small code cleanup, don't copy buffer for case-insensitive

encode, since that is not necessary.

c) Don't write max length as VC length indicator into target

and adjust target offsets accordingly (for VARCHAR/NVARCHAR).

4. Other changes in sql/exp/exp_function.cpp:

d) Handle 2 and 4 byte VC len indicators in hash function

and Hive hash function (problems unrelated to LP bugs fixed).

e) Add some asserts for cases where we assume VC length indicator

is a 2 byte integer.
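Change 3a can be sketched like this (an illustrative helper, not the actual encodeKeyValue() code): the reader must honor whichever indicator width the source field carries, 2 bytes for short varchars or 4 bytes for aligned-format rows.

```cpp
#include <cstdint>
#include <cstring>

// Illustrative helper: read a varchar length indicator that may be
// either 2 or 4 bytes wide, as described for VARCHAR/NVARCHAR above.
// memcpy avoids unaligned-access problems on strict platforms.
uint32_t readVCLen(const unsigned char* buf, int vcIndLen) {
    if (vcIndLen == 2) {
        uint16_t len16;
        std::memcpy(&len16, buf, sizeof(len16));
        return len16;
    }
    uint32_t len32;
    std::memcpy(&len32, buf, sizeof(len32));
    return len32;
}
```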

CompDecode is not yet changed. Filed bug 1444134 to do that for

the next release, since that change is less urgent.

Patch set 2: Copyright notice changes only.

Patch set 3: Updated expected regression test file that

prints out encoded key in hex.

Change-Id: Idab3ed488f8c1b9aabedba4689bfb8d7286b9538

  1. … 4 more files in changeset.
DDL Transactions,drop end to end & prepareCommit

Change-Id: I69b1d6b3babcaf61761c821bac091fd2101be729

  1. … 6 more files in changeset.
additional changes to support ALIGNED row format.

This feature is not externalized yet.

Change-Id: Idbf19022916d437bb7bb69019194de5057cbcb65

  1. … 20 more files in changeset.
Transactional DDL, Drop Table functionality

Added Transactional DDL, Drop Table functionality from SQL side to the

TransactionManager in the TM java side.

Code has been tested with create table/drop table flow with sqlci.

Change-Id: I3fc7d8cf395cd2ae06884c92d1d0c583a3100ea9

  1. … 15 more files in changeset.
Merge "Eliminate manual steps in load/ustat integration"

  1. … 1 more file in changeset.
Fix LP 1328250 - SQL Compiler hits "assert(FALSE)" in getIntConstValue

The SQL Compiler was abending and producing a core file in

getIntConstValue() when the Type of the integer constant was


This is code that is invoked only by the Native Expressions feature.

Currently, if the Native Expressions code encounters any PCODE

instruction that is designed for MBIGS or MBIGU type data, no native

expression is generated. Instead, we continue running with the PCODE

instructions as they are.

However, if PCODE optimization converts an MBIG[SU] type instruction

into a simpler instruction such as a 64-bit move instruction, then

the Native Expressions code goes ahead and tries to produce a native

expression. The problem is that, although PCODE optimization changes

to a simpler PCODE instruction, it does not change the data type of

any integer constant referenced by the instruction. The constant is

left as being an MBIG[SU] type even though the instruction's operands

say that the operand is a standard integer type.

The fix is to use the operand's idea of the data type rather than

the constant's idea of the data type whenever the constant is of

MBIGS or MBIGU type. This technique is already used by the code

when a Floating-Point type constant is referenced by a PCODE

instruction that says the operand is a simpler integer type.
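The idea of the fix can be sketched as follows; the enum and function names are invented for illustration and the big-number path is not shown. When the constant's recorded type is a big-number type but the instruction's operand describes a plain integer, the operand's type drives the extraction.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-ins for the PCODE type tags discussed above.
enum class PcType { Int64, BigNum };

// Before the fix, the constant's own type (possibly BigNum) drove the
// extraction and hit the assert; the fix trusts the operand's simpler
// type instead, mirroring what was already done for float constants.
int64_t getIntConstValueSketch(const unsigned char* data,
                               PcType constType, PcType operandType) {
    PcType effective =
        (constType == PcType::BigNum) ? operandType : constType;
    assert(effective == PcType::Int64);  // big-number path not sketched
    int64_t v;
    std::memcpy(&v, data, sizeof(v));
    return v;
}
```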

Change-Id: I0e1aa28a51b8f5311ac51d0da5f385dc93bcd4eb