ComSmallDefs.h

Clone Tools
Constraints
Constraints: committers
 
Constraints: files
Constraints: dates
jira TRAFODION-3157 Add support for BINARY/VARBINARY datatype

Spec attached to jira.

new test regress/seabase/TEST004 added.

  1. … 86 more files in changeset.
[TRAFODION-3234] Add support for hive partitioned tables

Refactored the interaction with hive metastore to use the efficient APIs to obtain the necessary

information to construct the hive table descriptors. This refactoring is expected to reduce the

memory requirements, efficient and high performing even when there are thousands of partitions.

  1. … 14 more files in changeset.
TRAFODION - 3218 User still has privilege after user's role has been revoked ...

Partial support for column level privileges with QI support for:

column select

column insert

column references

column update

Also, as part of this, updated privilege code in a couple of areas:

Changed object caching code in NATable and NARoutine to store all privileges

assigned to the object when the object is cached (privDescs_). During the load

operation, the code creates bitmaps (privInfo_) for the current user. Privilege

checks are performed against the user bitmaps (privInfo_). This is in

anticipation for some performance updates when connecting to Trafodion (mxosrvr)

with different users.

Change getRoleList to include the roleID and the granteeID that granted the

privilege. The grantee can be a user or a role.

When a privilege is revoked from a role, send QI keys for every user that has

been granted to role.

  1. … 40 more files in changeset.
[TRAFODION-2600] Unable to create view ... but user has SELECT privilege

Query invalidation is not resetting the role list when a user is granted a role.

For DML operations, we always retry the request once, and between retries, the

role list is reset. So DML works on a retry.

However, DDL operations are not retried, so the role list is not reset and the

create view fails.

An analogous issue exists when the role is revoked from a user and the role

list is not reset. In this case, the user can still create views even though

they no longer have the privilege.

Changes:

- Grant role: sends a new query invalidation key

- Revoke role: forces a query invalidation check even if the key is not present

- Displays query invalidation keys when debug option DBUSER_DEBUG is set, e.g:

set envvar DBUSER_DEBUG 1;

  1. … 11 more files in changeset.
take care of merge value

  1. … 4 more files in changeset.
Merge branch 'master' of git://git.apache.org/trafodion into TRAFODION-2335

  1. … 5 more files in changeset.
[TRAFODION-2335] support some functions as column default, part1 commit

  1. … 15 more files in changeset.
[TRAFODION-2974] Make event log reader and JDBC real TMUDFs

Roberta pointed out that we have two predefined UDFs, EVENT_LOG_READER

and JDBC, where the system administrator should have the ability to

control who can execute these functions.

To do this, these two UDFs cannot be "predefined" UDFs anymore, since

those don't have the metadata that's required for doing grant and

revoke.

Roberta also pointed out that the JDBC UDF should refuse to connect to

the T2 driver, for security reasons.

The fix leaves the predefined TMUDFs in place, for now, they will be

removed in R2.4 (see TRAFODION-2975).

The new "real" TMUDFs are in the "_LIBMGR_" schema, mostly for

convenience, as this schema has other UDFs that are created when

Trafodion is initialized.

  1. … 10 more files in changeset.
new COMMENT-ON SQL statement: resolve merge conflict

1. resolve enum type conflict while merging to esgyn

  1. … 1 more file in changeset.
new COMMENT-ON SQL statement: review change - error path

1. switch compile context in comment-on error path

2. correct spellings and others

  1. … 6 more files in changeset.
new COMMENT-ON SQL statement: review change - TEXT table modification

1. mv storage of COMMENTs in TEXT table

  1. … 10 more files in changeset.
TRAFODION-2731 CodeCleanup: Phase 2: Remove obsolete code

This phase handles the following:

-- removed files:

cli/rtdu.h, rtdu2.h, rtdu.cpp, rtdu.cpp

executor/dmeasql.h

executor/ExMeas.h, ExMeas.cpp

executor/tempfile.h, .cpp

executor/rcb.h

executor/stubs.cpp, stubs2.cpp

exp/srlversion.cpp

exp/exp_space.h

cli/VicKeyValuePair.h

cli/CliDll.cpp

cli/CliStubsStaticBuild.cpp

cli/globalsrlversion.cpp

cli/globalstubs.cpp

cli/sqlciSRLStubs.cpp

cli/test.cpp

cli/privsrlversion.cpp

common/SqlExpDllDefines.h

common/SqlExportDllDefines.h

sqlcat/enum.h

sqlcat/ReadTableDef.h, cpp

sqlcat/readRealArk.h, cpp

sqlshare/catapirequest.*

-- removed defines and code referencing them:

-- NA_STD_NAMESPACE

-- NA_NO_CMPCONTEXT

-- NA_CATMAN_SIM, NA_CATMAN_SIM_FS

-- common/purify.h

-- DONT_USE_MATH_H

-- NT_PORT

-- NA_MSVC

-- NA_NO_FRIENDS_WITH_TEMPLATE_REFS

-- NA_FLEXBUILD

-- removed multiple obsolete sqlci features and syntax:

(report writer, MACL, Help, Simulators, Utils, MXCS mode, Help,

and few others).

-- removed following files in sqlci dir:

CSInterface.h

CharSetConstants.cpp

CharSetConstants.h

MsgCat.cpp

MsgCat.h

MxciEHCallBack.cpp

MxciEHCallBack.h

RWInterface.cpp

RWInterface.h

SqlciCSCmd.cpp

SqlciCSCmd.h

SqlciCSSimulator.cpp

SqlciHelp.cpp

SqlciRWCmd.cpp

SqlciRWCmd.h

SqlciRWSimulator.cpp

SqlciUsage.cpp

SqlciUtil.cpp

SqlciUtil.h

UtilInt.cpp

UtilInt.h

UtilMsg.cpp

UtilMsg.h

immudefs.cpp

immudefs.h

  1. … 85 more files in changeset.
Ensured that CQD NUM_ESP_FRAGMENTS can take value upto 8. Removed the environment variable concept to set the number of esp fragments. Removed the CQDs ESP_NUM_FRAGMENTS_WITH_QUOTAS and ESP_MULTI_FRAGMENT_QUOTAS. Use the corresponding CQD ESP_NUM_FRAGMENTS and ESP_MUTLI_FRAGMENT instead (cherry picked from commit 72534e89633f3d4b8a61ab26d471aaa8b7f3e12a)

Conflicts:

core/sql/executor/ex_frag_rt.cpp

core/sql/executor/ex_frag_rt.h

core/sql/generator/GenRelMisc.cpp

  1. … 45 more files in changeset.
TRAFODION-2610 Register native HBase tables in traf MD plus couple more

-- support for registering hbase native cell/row format tables in traf MD

Syntax:

register [internal] hbase table [if not exists] <table-name>

unregister [internal] hbase table [if exists] <table-name>

-- support to get info related to registered hbase tables

Syntax: get hbase registered table in catalog trafodion

-- enhancements to HBase mapped tables and bug fixes

-- disabling of creation of external tables on native hbase

tables unless they are mapped tables

-- support for case-insensitive format string in TO_DATE function

(exp/exp_datetime.h)

-- regress/seabase/TEST022 enhanced with additional tests

  1. … 48 more files in changeset.
Fixed review comments

  1. … 5 more files in changeset.
[TRAFODION-2584] Add support to register hive objects in traf metadata

Syntax:

register hive table/view [if not exists] <object-name> [cascade]

unregister hive table/view [if exists] <object-name> [cascade]

cascade option: register/unregister all underlying objects that are

part of the specified view

update statistics, grant/revoke, traf views or external table creation

on hive objects will automatically and internally register those objects

in trafodion metadata.

invoke/showddl will show if this object is registered and whether

that registration was internal or explicit.

Get command extensions:

get hive registered tables/view/objects in catalog trafodion;

get hive external tables in catalog trafodion;

Cleanup command extensions:

cleanup metadata command will clean up inconsistent hive objects

(underlying hive object is missing but object is registered or

an external table exists)

cleanup [hive table | hive view] on <object-name>;

Existing hive objects that had implicit or explicit external tables

created prior to this checkin will have no change in behavior.

ObjectUID of those external tables will continue to be used for

relevant operations.

One can drop those external tables and explicitly register them,

or a subsequent operation(upd stats, grant, etc) that needs objectUID will

automatically register them.

minor changes based on review comments of previous checkin

get all objects command on hive metadata no longer fails.

get views on objects return 3-part name that could be used to

differentiate between a traf and hive view.

regress/hive/TEST007 has been extended.

TBD: Add register/unregister privileges

  1. … 52 more files in changeset.
Various fixes, details below

-- max length limited to 16777216 bytes (16M)

for char cols and functions (repeat, concat).

(optimizer/SynthType.cpp, common/ComSmallDefs.h, sqlcomp/nadefaults.cpp)

-- previous max length change requires adding of a new hbase property

called hbase.client.keyvalue.maxsize so large key/values could be

handled in hbase cell.

Following scripts have been updated to handle that.

(sqf/sql/scripts/install_local_hadoop,

install/ambari-installer/traf-mpack/common-services/TRAFODION/2.1/

service_advisor.py

install/installer/traf_hortonworks_mods

install/python-installer/configs/mod_cfgs.json)

Developers can also update hbase-site.xml with this property if

they dont want to reinstall local hadoop.

<property>

<name>hbase.client.keyvalue.maxsize</name>

<value>0</value>

</property>

-- while accessing a hive table as an external table, the hive table and

corresponding external table definitions are validated to be the

same. This validates that corresponding columns have the same

data attributes (type, length, scale, etc).

This check causes failures if hive column is of 'string' datatype.

That is because hive 'string' column length can be changed by

a cqd but the corresponding external table has predefined length

set when the table is created.

The validation check now ignores the length attribute if the hive column is

of 'string' datatype.

(optimizer/BindRelExpr.cpp, common/CharType.*, NAType.*)

-- data moved into direct buffer would sometimes cause overflow and crash.

The max direct buffer length used to send/retrieve hbase data

is now limited to 1G (executor/ExHbaseAccess.cpp)

-- errors during vsbb upsert are now handled correctly

(executor/ExHbaseIUD.cpp)

-- support for GET CATALOGS command

(generator/GenRelExeUtil.cpp, executor/ExExeUtilGet.cpp)

-- An incorrect computation would sometimes cause group by rollup to

crash compiler in NAHeap::unlinkLargeFragment()

(generator/GenRelGrby.cpp)

-- sort of a large row that exceeded sort pre-set buffer size would crash

if the row size exceeded the max buffer size.

That has been fixed by allocating space for atleast one row.

(generator/GenRelMisc.cpp)

-- purgedata on a delimited name was failing. That has been fixed.

(optimizer/RelExeUtil.h)

-- regress/tools/runregr_privs1/privs2 fixed to handle running of

subset of tests

-- regress/seabase/TEST031 updated with new tests

  1. … 30 more files in changeset.
Merge remote branch 'origin/master' into empty_lob_work

  1. … 22 more files in changeset.
Changes to support new syntax and implementation empty_blob/empty_clob for insert/updates Changes to support new syntax for updating a lob directly through a lobhandle without scanning the table.This is done via new ExeUtil operator. Changes to add columnname info to LOBMD_ table. This helps in mapping any given lobhandle back to the table and the column. Added new sections to the regression test executor/TEST130 to test these changes

  1. … 40 more files in changeset.
TRAFODION-1586 Add support for external Traf tables mappedd to HBase tables

This is the initial support for HBase mapped tables.

A doc for external manuals will be added later.

regress/seabase/TEST022 contains various testcases.

  1. … 77 more files in changeset.
TRAFODION-2363 Add support to return hive MD info in relational format

  1. … 21 more files in changeset.
[TRAFODION-2317] Infrastructure for common subexpressions

This is a first set of changes to allow us to make use of CTEs

(Common Table Expressions) declared in WITH clauses and to create

a temp table for them that is then read multiple times in the query.

This also includes a fix for

[TRAFODION-2280] Need to remove salt columns from uniqueness constraints

Summary of changes:

- Adding a unique statement number in CmpContext

- Moving the execHiveSQL method from the OSIM code to CmpContext

- Adding a list of common subexpressions and their references

to CmpStatement

- Adding the ability to the Hive Truncate operator to drop the

table when the TCB gets deallocated

- Adding the ability to the HDFS scan to compute scan ranges at

runtime. Those are usually determined in the compiler. This is

only supported for simple, non-partitioned, delimited tables.

We need this because we populate the temp table and read from

in in the same statement, without the option of compiling

after we inserted into the temp table.

- Special handling in the MapValueIds node of common subexpressions.

See the comment in MapValueId::preCodeGen().

- Moved the binder code to create a FastExtract node into a new

method FastExtract::makeFastExtractTree(), to be able to call

it from another place.

- MapValueIds no longer looks at the "used by MVQR flag" to determine

the method for VEGRewrite. Instead it checks whether a list of

values has been provided to do this.

- Adding a new method, RelExpr::prepareMeForCSESharing, that is

kind of an "unnormalizer", undoing some of the normalizer

transformations.

- Implementing the steps for common subexpression materializations

described below.

- Adding the ability to suppress the Hive timestamp modification

check when truncating a Hive table

- Adding an optimizer rule to eliminate CommonSubExprRef nodes.

These nodes should not normally survive past the SQO phase, but

if the SQO phase gets interrupted by an exception, that could

happen, since we then fall back to a copy of the tree before

SQO. In the future, we can consider cost-based decision on

what to do with common subexpressions.

- Adding CommonSubExprRef nodes in the parser whenever we expand

a CTE reference.

- Adding cleanup code to the "cleanup obsolete volatile tables"

command that removes obsolete Hive tables used for common

subexpressions.

Other changes contained in this change set:

- Optimization for empty scans, like select * from t where 1=0

This now generates a cardinality constraint with 0 rows, which

can be used later to eliminate parts of a tree.

(file OptLogRelExpr.cpp)

- [TRAFODION-2280] Need to remove salt columns from uniqueness

constraints generated on salted tables.

(file OptLogRelExpr.cpp)

- Got rid of the now meaningless "seamonster" display in EXPLAIN.

(file GenExplain.cpp and misc. expected files)

- Suppress display of "zombies" in the cstat command. Otherwise,

these zombies (marked as <defunct>) prevent Trafodion from

starting, because they are incorrectly considered "orphan"

processes. This could require a reboot when no reboot is necessary.

(file core/sqf/sql/scripts/pstat)

Incomplete list of things left to be done:

- TRAFODION-2316: Hive temp tables are not secure. Use volatile

tables instead.

- TRAFODION-2315: Add heuristics to decide when to use the temp table

approach.

- TRAFODION-2320: Make subquery unnesting work with common subexpressions.

Generated Plans

---------------

The resulting query plan for a query Q with n common

subexpressions CSE1 ... CSEn looks like this:

Root

|

MapValueIds

|

BlockedUnion

/ \

Union Q

/ \

... CTn

/

Union

/ \

CT1 CT2

Each of the CTi variables looks like the following, an

INSERT OVERWRITE TABLE tempi ...

BlockedUnion

/ \

Truncate FastExtract TEMPi

TEMPi |

CSEi

The original query Q has the common subexpressions replaced

with the following:

MapValueIds

|

scan TEMPi

Here is a simple query and its explain:

prepare s from

with cse1 as (select d_date_sk, d_date, d_year, d_dow, d_moy from date_dim)

select x.d_year, y.d_date

from cse1 x join cse1 y on x.d_date_sk = y.d_date_sk

where x.d_moy = 3;

>>explain options 'f' s;

LC RC OP OPERATOR OPT DESCRIPTION CARD

---- ---- ---- -------------------- -------- -------------------- ---------

11 . 12 root 1.46E+005

5 10 11 blocked_union 1.46E+005

7 9 10 merge_join 7.30E+004

8 . 9 sort 1.00E+002

. . 8 hive_scan CSE_TEMP_CSE1_MXID11 1.00E+002

6 . 7 sort 5.00E+001

. . 6 hive_scan CSE_TEMP_CSE1_MXID11 5.00E+001

1 4 5 blocked_union 7.30E+004

2 . 4 hive_insert CSE_TEMP_CSE1_MXID11 7.30E+004

. . 2 hive_scan DATE_DIM 7.30E+004

. . 1 hive_truncate 1.00E+000

--- SQL operation complete.

>>

CQDs to control common subexpressions

-------------------------------------

CSE_FOR_WITH is the master switch.

CQD Value Default Behavior

--------------------- --------- ------- ---------------------------------------

CSE_FOR_WITH OFF Y No change

ON Insert a CommonSubExprRef node in the

tree whenever we reference a CTE

(table defined in a WITH clause)

CSE_USE_TEMP OFF Y Disable creation of temp tables

for common subexpressions

SYSTEM Same as OFF for now

ON Always create a temp table for

common subexpressions where possible

CSE_DEBUG_WARNINGS OFF Y No change

ON Emit diagnostic warnings that show why

we didn't create temp tables for

common subexpressions

CSE_CLEANUP_HIVE_TABLES OFF Y No change

ON Cleanup Hive tables used for CSEs with

the "cleanup obsolete volatile tables"

command

CommonSubExprRef relational operators

-------------------------------------

This is a new RelExpr class that is introduced. It marks the common

subexpressions in a RelExpr tree. This operator remembers the name of

a common subexpression (e.g. the name used in the WITH clause).

Multiple such operators can reference to the same name. Each of

these references has a copy of the tree.

Right now, these operators are created in the parser when we expand a

CTE (Common Table Expression), declared in a WITH clause. If the CTE

is referenced only once, then the CommonSubExprRef operator is removed

in the binder - it also doesn't live up to its name in this case.

The remaining CommonSubExprRef operators keep track of various changes

to their child trees, during the binder and normalizer phases. In

particular, it tracks which predicates are pushed down into the child

tree and which outputs are eliminated.

The CmpStatement object keeps a global list of all the

CommonSubExprRef operators in a statement, so the individual operators

have a way to communicate with their siblings:

- A statement can have zero or more named common subexpressions.

- Each reference to a common subexpression is marked in the RelExpr

tree with a CommonSubExprRef node.

- In the binder and normalizer, common subexpressions are expanded,

meaning that multiple copies of them exist, one copy per

CommonSubExprRef.

- Common subexpressions can reference other common subexpressions,

so they, together with the main query, for a DAG (directed

acyclic graph) of dependencies.

- Note that CTEs declared in a WITH clause but not referenced are

ignored and are not part of the query tree.

In the semantic query optimization phase (SQO), the current code makes

a heuristic decision what to do with common subexpressions - to

evaluate them multiple times (expand) or to create a temporary table

once and read that table multiple times.

If we decide to expand, the action is simple: Remove the

CommonSubExprRef operator from the tree and replace it with its child.

If we decide to create a temp table, things become much more difficult.

We need to do several steps:

- Pick one of the child trees of the CommonSubExprRefs as the one to

materialize.

- Undo any normalization steps that aren't compatible with the other

CommonSubExprRefs. That means pulling out predicates that are not

common among the references and adding back outputs that are

required by other references. If that process fails, we fall back

and expand the expressions.

- Create a temp table tmp.

- Prepare an INSERT OVERWRITE TABLE tmp SELECT * FROM cse tree

that materializes the common subexpression in a table.

- Replace the CommonSubExprRef nodes with scans of the temp table.

- Hook up the insert query tree with the main query, such that it

is executed before the main query starts.

Temporary tables

----------------

At this time, temporary tables are created as Hive tables, with a

fabricated, unique name, including the session id, a unique statement

number within the session, and a unique identifier of the common

subexpression within the statement. The temporary table is created at

compile time. The query plan contains an operator to truncate the

table before populating it. The "temporary" Hive table is dropped when

the executor TCB is deallocated.

Several issues are remaining with this approach:

- If the process exits before executing and deallocating the statement,

the Hive table is not cleaned up.

Solution (TBD): Clean up these tables like we clean up left-over

volatile tables. Both are identified by the session id.

- If the executor runs into memory pressure and deallocates the TCB,

then allocates it again at a later time, the temp table is no longer

there.

Solution (TBD): Use AQR to recompile the query and create a new table.

- Query cache: If we cache a query, multiple queries may end up with

the same temporary table. This works ok as long as these queries are

executed serially, but it fails if both queries are executed at the

same time (e.g. open two cursors and fetch from both, alternating).

Solution (TBD): Add a CQD that disables caching queries with temp tables

for common subexpressions.

In the future we also want to support volatile tables. However, those also

have issues:

- Volatile tables aren't cleaned up until the session ends. If we run

many statements with common subexpressions, that is undesirable. So,

we have a similar cleanup problem as with Hive tables.

- Volatile tables take a relatively long time to create.

- Insert and scan rates on volatile Trafodion tables are slower than

those on Hive tables.

To-do items are marked with "Todo: CSE: " in the code.

  1. … 78 more files in changeset.
[TRAFODION-2282][TRAFODION-882] New scheme to invalidate histogram cache

  1. … 27 more files in changeset.
[TRAFODION-2187] Fix DROP SCHEMA CASCADE when sample tables are present

  1. … 5 more files in changeset.
Merge branch 'master' into trafodion-1788

Conflicts:

core/sql/regress/privs1/EXPECTED141

  1. … 7 more files in changeset.
[TRAFODION-2140] Move IUS feature to open source

  1. … 35 more files in changeset.
Merge branch 'master' into trafodion-1788

Conflicts:

core/sql/common/ComSmallDefs.h

core/sql/generator/Generator.cpp

core/sql/optimizer/NATable.cpp

core/sql/sqlcat/ReadTableDef.cpp

core/sql/sqlcat/desc.h

  1. … 20 more files in changeset.
metadata access performance improvement for accessing traf objects

Packed version of descriptor structures is now stored in traf TEXT

table during various DDL operations.

During dml, if packed descriptors exist, then they are read and used.

If they dont exist, then descriptors are generated on the fly.

Descriptor classes are defined in file sqlcat/TrafDDLdesc.h and

are based off the older desc structures from desc.h.

These are simplified and class'ified version of the older structures

and are versioned. Changes to these classes should be done carefully

by either using filler bytes or versioning them.

2 new cqds, traf_store_object_descriptor and traf_read_object_descriptor,

are added to control if packed descs should be stored/read.

These are currently set to OFF by default but are enabled during

regressions run. Once this feature is tested, these cqds will be turned

on by default.

In addition, some helper commands have been added to manipulate descrs.

Syntax:

alter table t [check|generate|delete|disable|enable] stored descritors

If authorization is enabled, then auth/priv specific info

is generated on the fly and currently not stored in metadata.

In a later checkin, these will also be stored in metadata.

  1. … 74 more files in changeset.
[TRAFODION-1882]: Column Privilege: a user can grant column privilege to ... [TRAFODION-1788]: Grant and Revoke on table columns with referencing views ...

The main issue is that the view-col <=> referenced-col usages were not available

from the metadata.

-- create view was changed to add view-col <=> referenced-col usages to the

TEXT table. This allows NATable and privilege management to retrieve this

information. No upgrade is required

-- Privilege management was changed to see if views could still exist based

only on column level privileges.

-- Grants and revokes on referenced columns for objects are now propagated to

to referencing views

-- Fixed an issue during column grants where column ordinals were not checked

correctly [TRAFODION-1882]

-- Fixed an issue where DB__ROOT, acting as schema owner, was able to grant

privileges that the schema owner was not able to grant,

  1. … 15 more files in changeset.
Merge in latest origin/master

  1. … 4 more files in changeset.