ExHbaseSelect.cpp

initial support for returning multiple versions and column timestamps

This feature is not yet externalized.

Support added to:

-- return multiple versions of rows:
   select * from table {versions N | MAX | ALL}

-- get hbase timestamp of a column:
   select hbase_timestamp(col) from t;

-- get version number of a column:
   select hbase_version(col) from t;
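A hedged example session, assuming a table t whose rows have been updated several times (the feature is not yet externalized, so the exact syntax may change):

select * from t {versions 2};        -- up to 2 versions of each row
select * from t {versions MAX};      -- all stored versions
select hbase_timestamp(c1) from t;   -- hbase timestamp of column c1
select hbase_version(c1) from t;     -- version number of column c1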

Change-Id: I37921681fc606a22c19d2c0cb87a35dee5491e1e

… 48 more files in changeset.
Move core into subdir to combine repos

… 10768 more files in changeset.
Move core into subdir to combine repos

… 10622 more files in changeset.
Move core into subdir to combine repos

Use "git log --follow -- <file>" to view file history through renames.

… 10837 more files in changeset.
Changes to enable Rowset select - Fix for bug 1423327

HBase always returns an empty result set when the row is not found. Trafodion has been changed to exploit this behavior to project no data in a rowset select.

The optimizer has now been enabled to choose a plan involving Rowset Select wherever possible. This can result in plan changes for queries:

nested join plan instead of hash join,

vsbb delete instead of regular delete,

vsbb insert instead of regular insert.

A new CQD, HBASE_ROWSET_VSBB_SIZE, has been added to control the hbase rowset size. The default value is 1000.
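A minimal sketch of overriding the default rowset size (the value 500 is an arbitrary assumption):

cqd HBASE_ROWSET_VSBB_SIZE '500';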

Change-Id: Id76c2e6abe01f2d1a7b6387f917825cac2004081

… 19 more files in changeset.
Merge "Snapshot Scan changes"

… 5 more files in changeset.
Snapshot Scan changes

The changes in this delivery include:

-decoupling the snapshot scan from the bulk unload feature. Setup of the temporary space and folders before running the query, and cleanup afterwards, used to be done by the bulk unload operator because snapshot scan was specific to bulk unload. In order to make snapshot scan independent of bulk unload and usable in any query, the setup and cleanup tasks are now done by the query itself at run time (the scan and root operators).

-caching of the snapshot information in NATable to optimize compilation time. Rework for caching: when the user sets TRAF_TABLE_SNAPSHOT_SCAN to LATEST, we flush the metadata and then set caching back on so that metadata gets cached again. If newer snapshots are created after the cqd is set, they won't be seen while the old entries are cached, unless the user issues a command/cqd to invalidate or flush the cache. One way to do that is to issue "cqd TRAF_TABLE_SNAPSHOT_SCAN 'latest';" again.

-code cleanup

Below is a description of the CQDs used with snapshot scan:

TRAF_TABLE_SNAPSHOT_SCAN

This CQD can be set to:

NONE --> (default) Snapshot scan is disabled and regular scan is used.

SUFFIX --> Snapshot scan is enabled for bulk unload (bulk unload behavior is not changed).

LATEST --> Snapshot scan is enabled independently of bulk unload, and the latest snapshot is used if it exists. If no snapshot exists, the regular scan is used. For this phase of the project the user needs to create the snapshots using the hbase shell or other tools. In the next phase of the project, new commands to create, delete and manage snapshots will be added.

TRAF_TABLE_SNAPSHOT_SCAN_SNAP_SUFFIX

This CQD is used with bulk unload; its value is used to build the snapshot name as the table name followed by the suffix string.

TRAF_TABLE_SNAPSHOT_SCAN_TABLE_SIZE_THRESHOLD

When the estimated table size is below the threshold (in MBs) defined by this CQD, the regular scan is used instead of snapshot scan. This CQD does not apply to bulk unload, which keeps the old behavior.

TRAF_TABLE_SNAPSHOT_SCAN_TIMEOUT

The timeout beyond which we give up trying to create the snapshot scanner.

TRAF_TABLE_SNAPSHOT_SCAN_TMP_LOCATION

Location for temporary links and files produced by snapshot scan.
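For example, a possible setup that enables snapshot scan outside bulk unload using the CQDs above (the threshold value is an assumption):

cqd TRAF_TABLE_SNAPSHOT_SCAN 'LATEST';
cqd TRAF_TABLE_SNAPSHOT_SCAN_TABLE_SIZE_THRESHOLD '100';
cqd TRAF_TABLE_SNAPSHOT_SCAN_TMP_LOCATION '/bulkload/temp_scan_dir/';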

Change-Id: Ifede88bdf36049bac8452a7522b413fac2205251

… 44 more files in changeset.
Adding more run-time memory allocations from NAHeap

This set of changes moves some of the string vector variables in HBase access operators from the standard string template to our NAList and NAString (or HbaseStr for row IDs). In the process, allocations of the objects will be from our NAHeap instead of the system heap. This makes it easier to track memory usage and detect leaks.

In addition, a change in ExHbaseAccessTcb::setupListOfColNames() prevents unnecessary allocations to populate the columns list unless it is empty. The Google profiling tools helped us identify this problem.

Also, removed the ExHbaseAccessDeleteTcb operator, which was not used.

Change-Id: I87ab674ab8e3d291f2fc9563718d88de537ae96b

… 10 more files in changeset.
Bulk unload optimization using snapshot scan

Resubmitting after facing git issues.

The changes consist of:

*implementing the snapshot scan optimization in the Trafodion scan operator

*changes to bulk unload to use the new snapshot scan

*changes to scripts and permissions (using ACLs)

*rework based on review

Details:

*Snapshot Scan:

----------------------

**Added support for snapshot scan to the Trafodion scan operator.

**The scan expects the hbase snapshots themselves to be created before running the query. When used with bulk unload, the snapshots can be created by bulk unload.

**The snapshot scan implementation can be used without bulk unload. To use the snapshot scan outside bulk unload we need to use the below cqds:

cqd TRAF_TABLE_SNAPSHOT_SCAN 'on';

-- the snapshot name will be the table name concatenated with the suffix-string
cqd TRAF_TABLE_SNAPSHOT_SCAN_SNAP_SUFFIX 'suffix-string';

-- temp dir needed for the hbase snapshot scan
cqd TRAF_TABLE_SNAPSHOT_SCAN_TMP_LOCATION '/bulkload/temp_scan_dir/';

**Snapshot scan can be used with table scans, index scans, etc.

*Bulk unload utility :

-------------------------------

**The bulk unload optimization is due to the newly added support for snapshot scan. By default bulk unload uses the regular scan, but when snapshot scan is specified it will use snapshot scan instead.

**To use snapshot scan with bulk unload we need to specify the new options in the bulk unload syntax: NEW|EXISTING SNAPSHOT HAVING SUFFIX QUOTED_STRING

***Using NEW in the above syntax means the bulk unload tool will create new snapshots, while using EXISTING means bulk unload expects the snapshots to exist already.

***The snapshot names are based on the table names in the select statement. The snapshot name needs to start with the table name and have the suffix QUOTED-STRING.

***For example, for "unload with NEW SNAPSHOT HAVING SUFFIX 'SNAP111' into 'tmp' select from cat.sch.table1;" the unload utility will create a snapshot CAT.SCH.TABLE1_SNAP111; and for "unload with EXISTING SNAPSHOT HAVING SUFFIX 'SNAP111' into 'tmp' select from cat.sch.table1;" the unload utility will expect a snapshot CAT.SCH.TABLE1_SNAP111 to exist already. Otherwise an error is produced.

***If this newly added option is not used in the syntax, bulk unload will use the regular scan instead of snapshot scan.

**The bulk unload utility queries the explain plan virtual table to get the list of Trafodion tables that will be scanned, and based on the case it either creates the snapshots for those tables or verifies that they already exist.

*Configuration changes

--------------------------------

**Enable ACLs in hdfs


*Testing

--------

**All developer regression tests were run and all passed

**bulk unload and snapshot scan were tested on the cluster

*Examples:

**Example of using snapshot scan without bulk unload:

(we need to create the snapshot first)

>>cqd TRAF_TABLE_SNAPSHOT_SCAN 'on';

--- SQL operation complete.

>>cqd TRAF_TABLE_SNAPSHOT_SCAN_SNAP_SUFFIX 'SNAP777';

--- SQL operation complete.

>>cqd TRAF_TABLE_SNAPSHOT_SCAN_TMP_LOCATION '/bulkload/temp_scan_dir/';

--- SQL operation complete.

>>select [first 5] c1,c2 from tt10;

C1 C2

--------------------- --------------------

.00 0

.01 1

.02 2

.03 3

.04 4

--- 5 row(s) selected.

**Example of using snapshot scan with unload:

UNLOAD

WITH PURGEDATA FROM TARGET

NEW SNAPSHOT HAVING SUFFIX 'SNAP778'

INTO '/bulkload/unload_TT14_3' select * from seabase.TT20 ;

Change-Id: Idb1d1807850787c6717ab0aa604dfc9a37f43dce

… 35 more files in changeset.
Reworked fix for LP bug 1404951

The scan cache size for an mdam probe is now set to the hbase default of 100. Setting it to values like 1 or 2 resulted in intermittent failures. The cqd COMP_BOOL_184 can be set to ON to get a cache size of 1 for mdam probes. The root cause of this intermittent failure will be investigated later.
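A sketch of the override described above:

cqd COMP_BOOL_184 'ON';  -- revert to a cache size of 1 for mdam probes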

Change-Id: Ic05a77ecb0deeb260784f156de251a0f0dbdf49c

… 5 more files in changeset.
Fix for LP bug 1404951

This is an intermittent problem that appears on the build machine, caused by Change-Id I5b570c42712d4c38157181c3b76bf9a3ab6e2ed9. In this delivery we are increasing the number of rows fetched by the hbase scan call to two rows, up from the previous value of one row. This applies only to the scan call used to determine mdam probe keys.

Change-Id: I12f20084bf53c188000db24ed8a698b3fdc7f41b

Merge "Fix for Mdam access causes large number of rows to be accessed."

SQL syntax to cancel executing query, phase 3

This change fixes some problems with subset DELETE and UPDATE statements which prevented them from responding to CANCEL. It addresses an identical potential issue in SELECT statements with predicates that reject large numbers of rows.

The change also allows an envvar, SQL_NO_REGISTER_CANCEL, which, if set to 1, prevents queries from registering with the cancel broker. It can be used to debug performance regressions.

The change also adds test cases to the regression test for UPDATE, DELETE, INSERT and UPSERT WITH LOAD.
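The commit does not show the cancel syntax itself; a hedged sketch, assuming the CONTROL QUERY CANCEL form with a placeholder query id:

control query cancel qid <query-id>;  -- <query-id> is a placeholder for the target query's QID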

Change-Id: I86977c3985db4f56f2d4a0e89051970cec2c9411

Implements: blueprint sql-query-cancel

… 6 more files in changeset.
Fix for Mdam access causes large number of rows to be accessed.

A fix by Dave Birdsall and Anoop Sharma for a problem where the mdam probe was causing a large number of rows to be accessed. The issue was that the scan cache was being set to 10000 based on cardinality estimates, but the mdam probe is meant to retrieve only one row at most. Additional changes improve debugging with the mdam predicate network.

Dummy delta change to get check tests to run again.

Change-Id: I5b570c42712d4c38157181c3b76bf9a3ab6e2ed9

… 2 more files in changeset.
LOB datatype infrastructure support

Technology preview.

More changes are expected as part of this work before it is user ready.

blueprint lob-support

This checkin contains basic support for creating blob/clob datatypes. The feature is disabled by default; instructions on how to enable it are listed below.

New test executor/TEST130 turns the feature on and tests out the functionality.

A new mxlobsrvr process will be started as part of sqstart.

Create and drop of tables with LOB columns. No support for alter.

DML support for LOB datatypes: insert, update and delete. Joins of 2 tables with LOB columns are allowed, but joins on the LOB columns themselves are not. Insert-select from one LOB table to another is not yet supported.

Link to document from the blueprint will be added shortly.

To enable and try LOBs:

On a developer workstation:

cqd TRAF_BLOB_AS_VARCHAR 'OFF';

On a cluster after installing the code, 2 steps are needed:

1. cqd TRAF_BLOB_AS_VARCHAR 'OFF';

2. sudo su hdfs --command "hadoop fs -mkdir /lobs"
   sudo su hdfs --command "hadoop fs -chown -R trafodion:trafodion /lobs"
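A hypothetical smoke test once the feature is enabled (stringtolob and stringtoclob are assumed helper functions; the commit does not spell out the DML syntax):

create table lobt1 (c1 int not null primary key, c2 blob, c3 clob);
insert into lobt1 values (1, stringtolob('hello'), stringtoclob('hello clob'));
select c1 from lobt1;
drop table lobt1;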

This checkin includes several merges from the mainline, and each of the lines below represents one commit to the project branch where this work was done.

-Turning off LOB code by default, but turning it on in executor/TEST130 just to ensure the code path is tested.

-Support for showddl and some syntax for external files and stream.

-LOB regression test

-Workaround for dtm issue LP 378167

-Changes to make append work. Changes to use lob heap.

-Fix for using system heap for all LOB allocations and handling NULLs.

-Added workaround for cursor delete issue. LP 1376969

-Fixes for update.

-Parser changes for exe_util_lob_extract

-Pull in lob extract code

-Adding mxlobsrvr directory

-Fixed the LOB interface to use 2 new params for cursor fetches. They do not overload the LOB handle and LOB handle length anymore. Added a flag to lobGlobals to identify that it's a hive access. Cleaned up parser code.

-LOB support for create,drop,insert,delete,select.

Change-Id: I7c8125696e847b71580b746388632e75741bd347

… 52 more files in changeset.
Native external hbase table access (select, IUD) changes.

-- IUD on external hbase tables is now enabled by default

-- predicates on native hbase tables can now be pushed down to the hbase region server

-- traf varchar col maxlength is now 200K by default; can be changed by cqd max_character_col_size

-- executor handles column value lengths greater than 32K during moves to/from JNI

-- error is correctly returned if data retrieved from hbase exceeds the expected max row length

-- hbase column_create function now takes an expression/param as its column name operand
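A hedged sketch of adjusting the new varchar limit via the CQD named above (the value shown is an arbitrary assumption):

cqd max_character_col_size '65536';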

Change-Id: Ieb3fcabfebaa22008eff2a049fc1e2000e68861e

… 46 more files in changeset.
Enabling runtime stats for hbase tables and operators

This is the third set of changes to collect the runtime stats info. Part of it addresses the comments and suggestions from the last review.

1) Instead of passing the hbase access stats entry to every htable call, set the pointer in the EXP hbase interface layer with the first init call in the tcb work methods (not the task work methods), and from there eventually in the htable JNI layer from getHTableClient() (sql/exp/ExpHbaseInterface.cpp).

2) Rewrote the construction of the hbase operator names into one method and use it to display both tdb contents and tcb stats.

3) Populate the hbase I/O bytes counter for both read and insert (sql/executor/HBaseClient_JNI.cpp).

4) Fix the problem that parsing a stats variable text string could go beyond the end of the string (getSubstrInfo() in sql/executor/ExExeUtilGetStats.cpp).

Change-Id: I62618b57894039bc1ca5bc0f3c9b89efec5cc42e

… 15 more files in changeset.
Enabling runtime stats for hbase tables and operators

This is the second set of changes to collect the runtime stats info for hbase tables and operators. It contains:

1) Stats for hbase IUD operations

2) Moved incActualRowsReturned() call to ExHbaseAccessTcb::moveRowToUpQueue()

3) Added Hbase call counter

4) Display full hbase operator names instead of generic "EX_HBASE_ACCESS" for hbase operator runtime stats

Change-Id: I94d727c897876a429b588f9acb3fec465dd56fe5

… 11 more files in changeset.
Enabling runtime stats for hbase operators

This is the first set of changes to collect the runtime stats info for hbase tables and operators. It contains:

1) Populate the estimated row count in the hbase access TDB.

2) Collect the hbase access time and accessed row count at the JNI layer (only for select operations now).

Partially reviewed by Mike H. and Selva G.

Removed the part that divides the estimated rows by the number of ESPs, based on the comments.

Change-Id: I5a98a8ae9c4462aa53ad889edfe4cd8563502477

… 11 more files in changeset.
Native Hbase access improvements

Native hbase access via the Trafodion SQL engine now utilizes the enhancements made for Trafodion tables, like pushing multiple rows to JNI and pre-fetch. Also removed the dependency on the ResultIterator for native hbase access. This eliminates yet another object created in HTableClient and improves java memory usage on the client side.

Cleaned up state changes in the HBase access operators and removed redundant code in the JNI layer.
Change-Id: I70ab52917aac64b68b3816b8ad834842a4d8745e

… 8 more files in changeset.
Pre-fetch cells from Hbase

Pre-fetch is enabled via a parameter in the HTableClient.startScan method. Pre-fetch is not done for unique and batch Trafodion operations, nor for any native Hbase table access. Pre-fetch is currently disabled for non-unique UMD Trafodion operations.

The startScan method invokes pre-fetch to Hbase in a different thread. When the fetchRows method is called, pre-fetch completes, passes cell info to JNI, and invokes pre-fetch again if there are more rows to be fetched.

We have observed around a 45% reduction in response time when fetching 12 million rows of a sixteen-partition table in a node via a single process.

Change-Id: I3c81e182663fddd08a2fc873a39302b179850c92

… 6 more files in changeset.
Cleanup scan related functions

Incorporated the comments from change id 331

Changed ExpHbaseInterface_JNI::fetchAllRows method to use the new scan methods

Removed the scan related methods that are no longer used

Triggered the cleanup of Java objects at the time of releaseHTableClient.

Increased the default maximum java heap size to 1024MB from 512MB.

Change-Id: I851bcfa266504f609fdbcba6f2a5e9e6dd2937d3

… 9 more files in changeset.
Reducing the path length for scan, single row get, batch get operations

Earlier, the key-values per row were buffered as a ByteBuffer and shipped to the JNI side. There were 2 C->java transitions to get the Hbase row into tuple format. Now, we avoid this copy on the Java side and ship the lengths and offsets of the Key-Values for all rows in a batch, along with a reference to the Key-Value buffer. The data is moved directly from Hbase buffers to the row format buffer. There are only 2 C->java transitions per batch of rows. This has shown a 3-4x reduction in path length in Trafodion SQL processes while scanning a table with 12 million rows.

Trafodion SQL processes dump core when the ex_assert function is called or when the process exits with a non-zero value. These processes will also dump core when the environment variable ABORT_ON_ERROR is set in release mode.

Fix for seabase/TEST010 failure with patchSet 1: Rowset Select was sometimes dumping core. A wrong tuple was used to calculate the row length, which caused the process to dump core.

Change-Id: I0ec4669b54971b6a0c699c0fa0662c85f68bd25d

… 13 more files in changeset.
Various Launchpad and other fixes.

-- metadata and statistics tables will no longer be created with the serialization attribute even if cqd hbase_serialization is set to ON. Also, reenabled HBASE_SERIALIZATION for regression runs. (sqlcomp/CmpSeabaseDDLcommon.cpp, regress/tools/sbdefs)

-- rowwise hbase rows from native hbase tables are now being created correctly in all cases. (executor/ExHbaseAccess.*, exp/exp_function.*, optimizer/BindItemExpr.cpp, ItemExpr.cpp)

-- IUD and SELECT execution state is now being correctly initialized at the beginning of a run. Multiple executions were failing otherwise. (executor/ExHbaseIUD.cpp, ExHbaseSelect.cpp)

-- a sign is now allowed in an interval literal; see the example after this list. (generator/GenItemFunc.cpp, GenRelScan.cpp, ItemFunc.h, ValueDesc.cpp)

-- the location value returned from updates was not being set correctly in some cases. That has been fixed. (generator/GenRelUpdate.cpp)

-- self-referencing updates were not returning the right values due to the Halloween issue. This is fixed by transforming them into insert/delete. (optimizer/BindRelExpr.cpp)

-- purgedata now returns an error if issued on hbase, hive, or neoview tables, or on a view. (optimizer/RelExeUtil.cpp, sqlcomp/CmpSeabaseDDLcommon.cpp)

-- referencing and referenced columns in a foreign key are now enforced to have the same datatype attributes. (sqlcomp/CmpSeabaseDDLtable.cpp)

-- drop schema now works with delimited schema names

-- in some cases, a create constraint failure was not dropping the table on which the constraint was being created. That has been fixed. (sqlcomp/CmpSeabaseDDLtable.cpp)

-- some additional infra changes for Traf as a mysql storage engine. (cli/*, executor/ExExeUtilCli.*)
Change-Id: I94d5eb13c826efdf44ba10c04ac52a671f86553e

… 23 more files in changeset.
Update Statistics performance improved by sampling in HBase

Update Statistics is much slower on HBase tables than it was on Seaquest. A recent performance analysis revealed that much of the deficit is due to the time spent retrieving the data from HBase that is used to derive the histograms. Typically, Update Statistics uses a 1% random sample of a table’s rows for this purpose. All rows of the table were retrieved from HBase, and the random selection of which rows to use was done in Trafodion.

To reduce the number of rows flowing from HBase to Trafodion for queries using a SAMPLE clause that specifies random sampling, the sampling logic was pushed into the HBase layer using a RandomRowFilter, one of the built-in filters provided by HBase. In the typical case of a 1% sample, this reduces the number of rows passed from HBase to Trafodion by 99%.

After the fix was implemented, Update Stats on various tables was 2 to 4 times faster than before, when using a 1% random sample.
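The typical case described above, in update statistics syntax (t1 is a hypothetical table):

update statistics for table t1 on every column sample random 1 percent;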

Change-Id: Icd40e4db1dde444dec76165c215596755afae96c

… 13 more files in changeset.
Changes to improve IUD statement performance

Change Owner: Selvaganesan Govindarajan

Reviewer: Mike Hanlon

Summary of change:

Avoids intermediate Thrift objects for IUD statements to improve their performance.

Changed the default max heap size for java objects to 512 MB.
Change-Id: Ib50734f82afcf54c1dec6c182c2f936c3de1c18a

… 8 more files in changeset.
Changes to improve the performance of select statement

Change Owner: Selvaganesan Govindarajan

Change Reviewers: Mike Hanlon, Khaled Bouaziz

Summary of change: Changes to eliminate the creation of intermediate objects when projecting rows, and instead project the rows from the raw trafodion format directly into the tuple format of our SQL engine.

Fix in ExHbaseAccessTcb::getColPos when the fetched column name goes out of sync with the fetched data. Earlier, this resulted in a core dump.

Change-Id: I646a9f0986de7fd562145f3746701c40d7cfb2f1

… 7 more files in changeset.
Code Drop Update - 5/23/14

Change-Id: If478e8857cbfa9652227af7ed83cd61dd075a889

… 163 more files in changeset.
Initial code drop of Trafodion

-0 +1819 ./ExHbaseSelect.cpp
… 4886 more files in changeset.