executor

initial support for returning multiple versions and column timestamps

This feature is not yet externalized.

Support added to:

-- return multiple versions of rows

-- select * from table {versions N | MAX | ALL}

-- get hbase timestamp of a column

-- select hbase_timestamp(col) from t;

-- get version number of a column.

-- select hbase_version(col) from t

Change-Id: I37921681fc606a22c19d2c0cb87a35dee5491e1e

  1. … 43 more files in changeset.
Fixes from review to sqvers

commit 04f3812f112a5629a563f02d7e72c5fa503c6a8d

Author: Sandhya Sundaresan <sandhya.sundaresan@hp.com>

Date: Sun Jun 14 04:23:21 2015 +0000

Preliminary checkin of lob support for external files. Inserts from http

files, hdfs files and lob local files are supported. Added support for

new extract syntax. Extract from lob

columns to hdfs files has been added. More work needed to support

binary files and very large files. Current limit is 1G.

Also fixed some error handling issues

Fixed some substring warning issues in the lobtostring/stringtolob

functions.

Added references and interfaces to curl library that is needed to read external http

files.

More work needed before this support can be used

Change-Id: Ieacaa3e4b7fa2a040764888c90ef0b029f107a8b

Change-Id: Ife3caf13041f8106a999d06808b69e5b6a348a6b

  1. … 27 more files in changeset.
Migrate from log4cpp to log4cxx

This change is a wholesale removal of log4cpp from source tree.

log4cxx is an external library installed via RPM, or user build

to default /usr/lib64 and /usr/include directories. Some of the

QRLogger and CommonLogger code was changed to use the new log4cxx

APIs.

Change-Id: I248bac0a8ffbfea6cbc1ba847867b30638892eae

  1. … 205 more files in changeset.
Code change for ESP colocation strategy described in LP 1464306

Change-Id: I838bc57541bf753e3bad90ae7776e870958c930a

  1. … 12 more files in changeset.
Reworked fix for 1452424 VSBB scan cause query to return wrong result

The assumption that rowID should be exactly equal to the calculated

key length is not correct. Trafodion SQL engine uses an extra null byte

to ensure that the row ID is not found in case of data conversion errors.

Hence direct ROWID buffer format is changed to accommodate this.

Now the format is

numRowIds + rowIDSuffix + rowId + rowIDSuffix + rowId + …

rowIDSuffix is '0': the subsequent rowID is of length = passed rowID len

rowIDSuffix is '1': the subsequent rowID is of length = passed rowID len + 1
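
The buffer layout described above can be sketched as a small pack/unpack pair. This is only an illustration, not the Trafodion code: the function names are hypothetical, and the field widths are assumptions (numRowIds as a 2-byte big-endian integer, rowIDSuffix as a single ASCII character '0' or '1'), since the commit message does not state them.

```python
import struct


def pack_row_ids(row_ids, row_id_len):
    """Pack row IDs as numRowIds + (rowIDSuffix + rowId) for each entry.

    Assumed widths (not from the commit): numRowIds is 2 bytes, big-endian;
    rowIDSuffix is one ASCII character, '0' or '1'.
    """
    out = bytearray(struct.pack(">H", len(row_ids)))
    for rid in row_ids:
        if len(rid) == row_id_len:
            out += b"0"          # row ID has exactly the passed length
        elif len(rid) == row_id_len + 1:
            out += b"1"          # row ID carries one extra byte
        else:
            raise ValueError("row ID must be row_id_len or row_id_len + 1 bytes")
        out += rid
    return bytes(out)


def unpack_row_ids(buf, row_id_len):
    """Parse a buffer produced by pack_row_ids back into a list of row IDs."""
    (count,) = struct.unpack_from(">H", buf, 0)
    pos = 2
    row_ids = []
    for _ in range(count):
        suffix = buf[pos:pos + 1]
        pos += 1
        if suffix not in (b"0", b"1"):
            raise ValueError("unexpected rowIDSuffix")
        length = row_id_len + (1 if suffix == b"1" else 0)
        row_ids.append(buf[pos:pos + length])
        pos += length
    return row_ids
```

Because each entry states its own length through the suffix, an extra null byte on one row ID no longer shifts the parsing of the row IDs that follow it.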

Change-Id: I07a283895f6f9c652b3f933bcf0330b69ee2d300

Merge "Part 2 of ALTER TABLE/INDEX ALTER HBASE_OPTIONS support"

Removed about 50 obsolete make files from sql

Security is not built inside sql, and sqlutils is not in Trafodion, so

their makefiles are not needed and have been deleted. Also deleted a few

other unused makefiles. In modified makefiles, removed unused parts.

The sql/nskgmake/Makerules* files still have references to sqlutils; those

lines will be considered in a separate change.

In second patch set, deleted unused directory sql/nskgmake/exeindp2

In third patch, added missing copyrights in

sql/executor/OrcFileReader.java

sql/regress/udr/TEST101.java

and deleted empty file

sql/parser/StmtDDLAlterTableAlterColumnRecalibrateSG.h

Change-Id: Id9ab492d4a23579cd0ab6db7279e0da56c55ea4c

  1. … 65 more files in changeset.
Additional fix for 1452424 VSBB scan cause query to return wrong result

VSBB scan was returning wrong results randomly even for the same query.

An extra null byte was added in the row id at times while setting up the

unique key. This caused a shift in row id parsing at java layer leading

to row not found for all row ids after this extra byte.

In addition, VSBB select was not getting GET_EOD or GET_NOMORE queue

entries at the end of the query. Hence, the rowset was never getting

closed causing a resource leak.

Also, retained the other changes that helped in debugging this issue.

Change-Id: I6a807bddc8edaff2f4140931d4f228e94badcc05

Part 2 of ALTER TABLE/INDEX ALTER HBASE_OPTIONS support

blueprint alter-table-hbase-options

This set of changes includes non-transactional HBase semantics

for ALTER TABLE/INDEX ALTER HBASE_OPTIONS. As part of this

change, two sets of logic that manipulate HBase options in

the CREATE TABLE code path were refactored so that logic

could be reused by ALTER TABLE/INDEX. In both these cases,

the logic in question was taken out of a create-side method

and moved into a separate method.

Still to come is the equivalent transactional support, and

also a regression test script that exercises this feature.

With this change, ALTER TABLE/INDEX ALTER HBASE_OPTIONS works

but only in a non-transactional manner. That is, if the

operation is aborted after the request has been issued to HBase,

the change to the Trafodion metadata may be rolled back but

the change to the underlying HBase object may still complete.

Change-Id: Ied11d0b2ac7b7d65110d39a1e5a7fbbac1e1f827

  1. … 7 more files in changeset.
Move core into subdir to combine repos

  1. … 10754 more files in changeset.
Move core into subdir to combine repos

  1. … 10608 more files in changeset.
Move core into subdir to combine repos

Use: git log --follow -- <file>

to view file history thru renames.

  1. … 10823 more files in changeset.
Rework for incremental IM during bulk load

Address comments by Hans and fix 1 regression failure

A regression failure in executor/test013 was caused by how external

names are used with volatile indexes. This has been fixed in GenRelExeUtil.cpp.

The parser change suggested could not be made due to increasing conflicts.

Thank you for the feedback.

Change-Id: Icdf5dbbf90673d44d5d0ccb58086266520fcf5c3

  1. … 5 more files in changeset.
Configuring hbase option MAX_VERSION via SQL

Change-Id: I88041d539b24de1289c15654151f5320b67eb289

  1. … 9 more files in changeset.
Changes in Patchset2

Fixed issues found during review.

Most of the changes are related to disabling this change for unique indexes.

When unique indexes are found, they alone are disabled during the load.

Other indexes are online and are handled as described below. Once the base

table and regular indexes have been loaded, unique indexes are loaded from

scratch using a new command "populate all unique indexes on <tab-name>".

A similar command "alter table <tab-name> disable all unique indexes"

is used to disable all unique indexes on a table at the start of load.

Cqd change setting allow_incompatible_assignment is unrelated and fixes an

issue related to loading timestamp types from hive.

Odb change gets rid of minor warnings.

Thanks to all three reviewers for their helpful comments.

-----------------------------------

Adding support for incremental index maintenance during bulk load.

Previously, when bulk loading into a table with indexes, the indexes were first

disabled, the base table was loaded, and then the indexes were populated from

scratch one by one. This could take a long time when the table had significant

data prior to the load.

Using a design by Hans this change allows indexes to be loaded in the same

query tree as the base table. The query tree looks like this:

                Root
                 |
             NestedJoin
            /          \
         Sort         Traf_load_prep (into index1)
          |
       Exchange
          |
      NestedJoin
      /        \
   Sort       Traf_load_prep (i.e. bulk insert) (into base table)
    |
 Exchange
    |
 Hive scan

This design and change set allows multiple indexes to be on the same tree.

Only one index is shown here for simplicity. LOAD CLEANUP and LOAD COMPLETE

statements also now perform these tasks for the base table along with all

enabled indexes.

This change is enabled by default. If a table has indexes, they will be

incrementally maintained during bulk load.

The WITH NO POPULATE INDEX option has been removed.

A new option WITH REBUILD INDEXES has been added. With this option we get

the old behaviour of disabling all indexes before the load into the table and

then populating all of them from scratch.
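
The difference between the two behaviours can be sketched with a toy in-memory model. Everything here is hypothetical (the function names, the dict-based table, the set-based index) and it assumes that loaded rows carry new primary keys; it is meant only to show why the incremental path touches fewer rows when the table already holds data.

```python
def index_key(row, pk, column):
    # An index entry pairs the indexed column value with the primary key.
    return (row[column], row[pk])


def load_incremental(base, indexes, new_rows, pk="id"):
    """One pass: each incoming row goes to the base table and to every index."""
    for row in new_rows:
        base[row[pk]] = row
        for column, index in indexes.items():
            index.add(index_key(row, pk, column))
    return len(new_rows)  # rows processed for index maintenance


def load_with_rebuild(base, indexes, new_rows, pk="id"):
    """Old behaviour: empty the indexes, load the base table, repopulate."""
    for index in indexes.values():
        index.clear()
    for row in new_rows:
        base[row[pk]] = row
    for column, index in indexes.items():
        for row in base.values():  # scans pre-existing rows as well
            index.add(index_key(row, pk, column))
    return len(base)  # rows processed for index maintenance
```

Both functions leave the indexes in the same state; the rebuild variant scans the whole table per index, which is the cost the commit describes for tables with significant existing data.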

Change-Id: Ib5491649e753b81e573d96dfe438c2cf8481ceca

  1. … 31 more files in changeset.
RMS and other related changes

The process level statistics at ESP level can now be obtained

using RMS while the query is running. The SQL command is

select * from table(statistics(null, 'QID=<qid>,DETAIL=1'))

To get a list of ESP processes participating in the query execution, issue

select distinct tdb_id, text process_name from

table(statistics(null, 'QID=<qid>,DETAIL=1')) order by tdb_id

Explain information can now be obtained using RMS. To do this

CQD EXPLAIN_IN_RMS 'ON' should be issued in the session before the

query is prepared.

EXPLAIN OPTIONS 'F' FOR QID <qid> from a different session can give

the explain info.

TESTRTS is now incorporated into core test suite.

AQR-ed queries were not getting garbage collected from RMS shared segment.

This has been fixed.

Missing code in IpcGuardianServer::serverDied method is now added.

Change-Id: Ib591ee5c9251ab6778e3dec9450f5b0466041e9b

  1. … 15 more files in changeset.
Fix for 1452424 vsbb scan/delete cause query to return wrong result

VSBB update/delete were not tracking the number of rows in the

buffer. This has been corrected.

Change-Id: I2a89ccd9a84832c4771481de2ee8503e912ce0d8

  1. … 2 more files in changeset.
various lp and other fixes, details below.

-- added support for self referencing constraints

-- limit clause can now be specified as a param

(select * from t limit ?)

-- lp 1448261. alter table add identity col is not allowed and now

returns an error

-- error is returned if a specified constraint in an alter/create statement

exists on any table

-- lp 1447343. cannot have more than one identity column.

-- embedded compiler is now used to get priv info during invoke/showddl.

-- auth info is not reread if already initialized

-- sequence value function is now cacheable

-- lp 1448257. inserts in volatile table with identity column now work

-- lp 1447346. inserts with identity col default now work if inserted

in a salted table.

-- only one compiler is now needed to process ddl operations with or

without authorization enabled

-- query cache in embedded compiler is now cleared if user id changes

-- pre-created default schema 'SEABASE' can no longer be dropped

-- default schema 'SCH' is automatically created if running regressions

and it doesn't exist.

-- improvements in regressions run.

-- regressions run no longer call a script from another sqlci session

to init auth, create default schema

and insert into defaults table before every regr script

-- switched the order of regression runs

-- updates from review comments.

Change-Id: Ifb96d9c45b7ef60c67aedbeefd40889fb902a131

  1. … 68 more files in changeset.
Enabling Bulk load and Hive Scan error logging/skip feature

Also fixed the hanging issue with Hive scan (ExHdfsScan operator) when there

is an error in data conversion.

ExHbaseAccessBulkLoadPrepSQTcb was not releasing all the resources when there

is an error or when the last buffer had some rows.

Error logging/skip feature can be enabled in

hive scan using CQDs and in bulk load using the command line options.

For Hive Scan

CQD TRAF_LOAD_CONTINUE_ON_ERROR 'ON' to skip errors

CQD TRAF_LOAD_LOG_ERROR_ROWS 'ON' to log the error rows in Hdfs files.

For Bulk load

LOAD WITH CONTINUE ON ERROR [TO <location>] – to skip error rows

LOAD WITH LOG ERROR ROWS – to log the error rows in hdfs files.

The default parent error logging directory in hdfs is /bulkload/logs. The error

rows are logged in subdirectory ERR_<date>_<time>. A separate hdfs file is

created for every process/operator involved in the bulk load in this directory.

Error rows in hive scan are logged in

<sourceHiveTableName>_hive_scan_err_<inst_id>

Error rows in bulk upsert are logged in

<destTrafTableName>_traf_upsert_err_<inst_id>

Bulk load can also be aborted after a certain number of error rows are seen using

the LOAD WITH LOG ERROR ROWS, STOP AFTER <n> ERROR ROWS option.
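
The skip, log, and stop-after behaviours can be sketched as a small loop. This is a hypothetical model, not the executor code: the names load_rows and TooManyErrorRows are made up, the error log is an in-memory list rather than an HDFS file, and it assumes that the load aborts as soon as the error count reaches <n> and that logging error rows implies continuing past them.

```python
class TooManyErrorRows(Exception):
    """Raised when the number of error rows reaches the stop-after limit."""


def load_rows(rows, convert, continue_on_error=False,
              log_error_rows=False, stop_after=None):
    """Convert and collect rows, optionally skipping and logging bad ones.

    Returns (loaded_rows, logged_error_rows). Without either option, the
    first conversion error propagates and the load fails.
    """
    loaded, error_log, errors = [], [], 0
    for row in rows:
        try:
            loaded.append(convert(row))
        except ValueError:
            if not (continue_on_error or log_error_rows):
                raise
            errors += 1
            if log_error_rows:
                error_log.append(row)  # stand-in for the HDFS error file
            # Assumption: abort once the count reaches the limit.
            if stop_after is not None and errors >= stop_after:
                raise TooManyErrorRows(f"{errors} error rows seen")
    return loaded, error_log
```

For example, `load_rows(["1", "x", "3"], int, log_error_rows=True)` loads the two valid rows and records `"x"` as an error row.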

Change-Id: Ief44ebb9ff74b0cef2587705158094165fca07d3

  1. … 22 more files in changeset.
Using the language manager for UDF compiler interface

blueprint cmp-tmudf-compile-time-interface

This change includes new CLI calls, to be used in the compiler to

invoke routines. Right now, only trusted routines are supported,

executed in the same process as the caller, but in the future we may

extend this to isolated routines. Using a CLI call allows us to share

the language manager between compiler and executor, since language

manager resources such as the JVM and loaded DLLs exist only once per

process. This change is in preparation for Java UDFs.

Changes in a bit more detail:

- Added 4 new CLI calls to allocate a routine, invoke it, retrieve

updated invocation and plan infos and deallocate (put) the routine.

The CLI globals now have a C/C++ and a Java language manager that

is allocated on demand.

- The compiler no longer loads a DLL for the UDF compiler interface,

it uses the new CLI calls instead.

- DDL syntax is changed to allow TMUDFs in Java (not officially

supported, so don't use it quite yet).

- TMUDFs in C are no longer supported, only C++ and Java are.

Converted remaining TMUDF tests to C++.

- C++ TMUDFs now do a basic verification at DDL time, so errors

like missing entry points are detected earlier. Validation for

Java TMUDFs is also done through the CLI.

- Make sure we have no memory or resource leaks:

- CmpContext keeps track of UDF-related objects allocated on

system heap and in the CLI, cleaned up at the end of a statement

- CLI keeps a list of allocated trusted routines, cleaned up

when a CLI context is deallocated

- Using ExeCliInterface class to make the new CLI calls (4 new calls

added).

- Removed CmpCli class in the optimizer directory and converted

tracking compiler to use ExeCliInterface as well.

- Compile-time parameter values are no longer baked into the

UDRInvocationInfo. Instead, they are provided as an input row, the

same way as they are provided at runtime.

- Bug fixes in C++ UDR code, mostly related to serialization and

to multiple interactions with the UDF through serialized objects.

- Added more info to UDRInvocationInfo (SQL access type, etc.).

- Since there are multiple plans per invocation, each of which

can have multiple interactions with the UDF, plans need to be

numbered so the UDF side can tell them apart to attach the

right state (owned by the UDF) to it.

- The language manager needs some functions that are provided by

the process it's running in. Added those (empty, for now) functions

as cli/CliImplLmExtFunc.cpp.

- Added a new class for Java TMUDFs, LmRoutineJavaObj. Added methods

to allocate such routines and to load their class as well as to

create Java objects by invoking the default constructor through JNI.

- Java TMUDFs use the new UDR interface (to be provided by Suresh and

Pavani). In the language manager, the container is the class of

the UDF, the external path is the fully qualified jar name. The

Java method name is <init>, the default constructor, with signature

"()V". Some code changes were required to do this.

- Created a new directory trafodion/core/sql/src for Java sources in

the sql engine. Right now, only language manager java

sources are in this directory, but I am planning to move the other

java sources under sql in a future checkin. Suresh and Pavani

will add their UDF-related Java files there as well.

- Renamed the udr jar to trafodion-sql-<version>.jar, in anticipation

of combining all the sql Java sources into this jar.

- Created a maven project file trafodion/core/sql/pom.xml and

changed makefiles to invoke maven to build java sources.

- More work to separate new UDR interface from older SPInfo object,

so that we can get rid of SPInfo if/when we don't support the older

style anymore.

- Small fix to odb makefile, make clean failed when executed twice.

Patch set 2: Adding a custom filter for test regress/udr/TEST108.

Change-Id: Ic827a42ac25505fb1ee451b79636c0f9349d8841

  1. … 96 more files in changeset.
DDL Tx, Changes to handle upsert using load operations on abort

SQL statements 'upsert using load' will be executed outside of a

transaction even though the operation can be performed inside a

transaction. For this reason, it is necessary for us to handle this

data differently when a transaction aborts or fails.

These operations will be registered as part of the TmDDL manager

and if the transaction is aborted, that table gets truncated. The data

loaded will be deleted but table will still exist. At the moment, this is

handled by using disable, delete and create hbase methods. However, when

hbase is upgraded to the next version, we will use the truncate hbase

method option.
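
The abort handling described above can be sketched with a toy in-memory store. The classes ToyStore and ToyTransaction are hypothetical stand-ins; they are not the HBase admin API or the TmDDL manager, and they only model the sequence of registering the table, then disabling, deleting, and re-creating it on abort.

```python
class ToyStore:
    """In-memory stand-in for HBase tables (not the real HBase API)."""

    def __init__(self):
        self.tables = {}

    def create(self, name, options):
        self.tables[name] = {"options": dict(options), "rows": {}, "enabled": True}

    def disable(self, name):
        self.tables[name]["enabled"] = False

    def delete(self, name):
        if self.tables[name]["enabled"]:
            raise RuntimeError("table must be disabled before it is deleted")
        del self.tables[name]


class ToyTransaction:
    """Tracks tables written by 'upsert using load' so abort can truncate them."""

    def __init__(self, store):
        self.store = store
        self.loaded_tables = set()

    def upsert_using_load(self, table, rows):
        # The data is written immediately, outside transactional protection.
        self.store.tables[table]["rows"].update(rows)
        self.loaded_tables.add(table)  # register the table for abort handling

    def abort(self):
        # Truncate each registered table via disable, delete, and create.
        for table in self.loaded_tables:
            options = self.store.tables[table]["options"]
            self.store.disable(table)
            self.store.delete(table)
            self.store.create(table, options)
        self.loaded_tables.clear()
```

After `abort()`, the table still exists with its original options but holds no rows, which matches the truncate behaviour the commit describes.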

Change-Id: Ica9d81a284b5f1821a3048b9c8deaad449a4c4f4

  1. … 15 more files in changeset.
get indexLevel and blockSize from Hbase metadata to use in costing code.

Change-Id: I7b30364ec83a763d3391ddc39e12adec2ca1bd00

  1. … 7 more files in changeset.
Merge "Transactional DDL - Salted Tables functionality"

Transactional DDL - Salted Tables functionality

Added salted table functionality to transactional DDL create.

It currently works with large numbers of table partitions and large

key lengths.

Change-Id: Ia5d9113678d697fdcd9f60021fc2dd3eb18fda0f

  1. … 21 more files in changeset.
Changes to enable Rowset select - Fix for bug 1423327

HBase always returns an empty result set when the row is not found. Trafodion

is changed to exploit this concept to project no data in a rowset select.

The optimizer is now enabled to choose a plan involving Rowset Select

wherever possible. This can result in plan changes for the queries -

nested join plan instead of hash join,

vsbb delete instead of delete,

vsbb insert instead of regular insert.

A new CQD HBASE_ROWSET_VSBB_SIZE is now added to control the hbase rowset size.

The default value is 1000.
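
The idea of treating an empty result as "row not found" in a batched lookup can be sketched as follows. This is a hypothetical model using a plain dict in place of an HBase table; the function name batched_get is made up, and only the default batch size of 1000 is taken from the commit message.

```python
def batched_get(table, keys, rowset_size=1000):
    """Yield (key, row) pairs for the keys that exist, one batch at a time.

    Each batch emulates a multi-row get that returns one result per key,
    with an empty result for a row that is not found.
    """
    for start in range(0, len(keys), rowset_size):
        batch = keys[start:start + rowset_size]
        results = [table.get(key, {}) for key in batch]
        for key, result in zip(batch, results):
            if result:  # empty result means row not found: project nothing
                yield key, result
```

Because results stay aligned with the requested keys, missing rows can simply be dropped without a separate existence check.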

Change-Id: Id76c2e6abe01f2d1a7b6387f917825cac2004081

  1. … 13 more files in changeset.
Merge "create meta column family for transaction data"

Fast Transport fix

fix for LP#1444575.

This checkin addresses an issue with size of the buffer where

we get the row before converting to delimited format. The

buffer in this case is a single row buffer.

Change-Id: I33ad4bb0a5f2f84b8f56983b76b1b9ba73c9f6f6

  1. … 3 more files in changeset.
Merge "Changes to reduce the memory growth/leak in mxosrvr and T2 driver"

DDL Transactions, drop end to end & prepareCommit

Change-Id: I69b1d6b3babcaf61761c821bac091fd2101be729

  1. … 7 more files in changeset.
additional changes to support ALIGNED row format.

This feature is not externalized yet.

Change-Id: Idbf19022916d437bb7bb69019194de5057cbcb65

  1. … 21 more files in changeset.