[TRAFODION-2317] Infrastructure for common subexpressions This is a first set of changes to allow us to make use of CTEs (Common Table Expre…
[TRAFODION-2317] Infrastructure for common subexpressionsThis is a first set of changes to allow us to make use of CTEs(Common Table Expressions) declared in WITH clauses and to createa temp table for them that is then read multiple times in the query.This also includes a fix for[TRAFODION-2280] Need to remove salt columns from uniqueness constraintsSummary of changes:- Adding a unique statement number in CmpContext- Moving the execHiveSQL method from the OSIM code to CmpContext- Adding a list of common subexpressions and their references to CmpStatement- Adding the ability to the Hive Truncate operator to drop the table when the TCB gets deallocated- Adding the ability to the HDFS scan to compute scan ranges at runtime. Those are usually determined in the compiler. This is only supported for simple, non-partitioned, delimited tables. We need this because we populate the temp table and read from in in the same statement, without the option of compiling after we inserted into the temp table.- Special handling in the MapValueIds node of common subexpressions. See the comment in MapValueId::preCodeGen().- Moved the binder code to create a FastExtract node into a new method FastExtract::makeFastExtractTree(), to be able to call it from another place.- MapValueIds no longer looks at the "used by MVQR flag" to determine the method for VEGRewrite. Instead it checks whether a list of values has been provided to do this.- Adding a new method, RelExpr::prepareMeForCSESharing, that is kind of an "unnormalizer", undoing some of the normalizer transformations.- Implementing the steps for common subexpression materializations described below.- Adding the ability to suppress the Hive timestamp modification check when truncating a Hive table- Adding an optimizer rule to eliminate CommonSubExprRef nodes. These nodes should not normally survive past the SQO phase, but if the SQO phase gets interrupted by an exception, that could happen, since we then fall back to a copy of the tree before SQO. In the future, we can consider cost-based decision on what to do with common subexpressions.- Adding CommonSubExprRef nodes in the parser whenever we expand a CTE reference.- Adding cleanup code to the "cleanup obsolete volatile tables" command that removes obsolete Hive tables used for common subexpressions.Other changes contained in this change set:- Optimization for empty scans, like select * from t where 1=0 This now generates a cardinality constraint with 0 rows, which can be used later to eliminate parts of a tree. (file OptLogRelExpr.cpp)- [TRAFODION-2280] Need to remove salt columns from uniqueness constraints generated on salted tables. (file OptLogRelExpr.cpp)- Got rid of the now meaningless "seamonster" display in EXPLAIN. (file GenExplain.cpp and misc. expected files)- Suppress display of "zombies" in the cstat command. Otherwise, these zombies (marked as <defunct>) prevent Trafodion from starting, because they are incorrectly considered "orphan" processes. This could require a reboot when no reboot is necessary. (file core/sqf/sql/scripts/pstat)Incomplete list of things left to be done:- TRAFODION-2316: Hive temp tables are not secure. Use volatile tables instead.- TRAFODION-2315: Add heuristics to decide when to use the temp table approach.- TRAFODION-2320: Make subquery unnesting work with common subexpressions.Generated Plans---------------The resulting query plan for a query Q with n commonsubexpressions CSE1 ... CSEn looks like this: Root | MapValueIds | BlockedUnion / \ Union Q / \ ... CTn / Union / \ CT1 CT2Each of the CTi variables looks like the following, anINSERT OVERWRITE TABLE tempi ... BlockedUnion / \ Truncate FastExtract TEMPi TEMPi | CSEiThe original query Q has the common subexpressions replacedwith the following: MapValueIds | scan TEMPiHere is a simple query and its explain:prepare s fromwith cse1 as (select d_date_sk, d_date, d_year, d_dow, d_moy from date_dim)select x.d_year, y.d_datefrom cse1 x join cse1 y on x.d_date_sk = y.d_date_skwhere x.d_moy = 3;>>explain options 'f' s;LC RC OP OPERATOR OPT DESCRIPTION CARD---- ---- ---- -------------------- -------- -------------------- ---------11 . 12 root 1.46E+0055 10 11 blocked_union 1.46E+0057 9 10 merge_join 7.30E+0048 . 9 sort 1.00E+002. . 8 hive_scan CSE_TEMP_CSE1_MXID11 1.00E+0026 . 7 sort 5.00E+001. . 6 hive_scan CSE_TEMP_CSE1_MXID11 5.00E+0011 4 5 blocked_union 7.30E+0042 . 4 hive_insert CSE_TEMP_CSE1_MXID11 7.30E+004. . 2 hive_scan DATE_DIM 7.30E+004. . 1 hive_truncate 1.00E+000--- SQL operation complete.>>CQDs to control common subexpressions-------------------------------------CSE_FOR_WITH is the master switch.CQD Value Default Behavior--------------------- --------- ------- ---------------------------------------CSE_FOR_WITH OFF Y No change ON Insert a CommonSubExprRef node in the tree whenever we reference a CTE (table defined in a WITH clause)CSE_USE_TEMP OFF Y Disable creation of temp tables for common subexpressions SYSTEM Same as OFF for now ON Always create a temp table for common subexpressions where possibleCSE_DEBUG_WARNINGS OFF Y No change ON Emit diagnostic warnings that show why we didn't create temp tables for common subexpressionsCSE_CLEANUP_HIVE_TABLES OFF Y No change ON Cleanup Hive tables used for CSEs with the "cleanup obsolete volatile tables" commandCommonSubExprRef relational operators-------------------------------------This is a new RelExpr class that is introduced. It marks the commonsubexpressions in a RelExpr tree. This operator remembers the name ofa common subexpression (e.g. the name used in the WITH clause).Multiple such operators can reference to the same name. Each ofthese references has a copy of the tree.Right now, these operators are created in the parser when we expand aCTE (Common Table Expression), declared in a WITH clause. If the CTEis referenced only once, then the CommonSubExprRef operator is removedin the binder - it also doesn't live up to its name in this case.The remaining CommonSubExprRef operators keep track of various changesto their child trees, during the binder and normalizer phases. Inparticular, it tracks which predicates are pushed down into the childtree and which outputs are eliminated.The CmpStatement object keeps a global list of all theCommonSubExprRef operators in a statement, so the individual operatorshave a way to communicate with their siblings:- A statement can have zero or more named common subexpressions.- Each reference to a common subexpression is marked in the RelExpr tree with a CommonSubExprRef node.- In the binder and normalizer, common subexpressions are expanded, meaning that multiple copies of them exist, one copy per CommonSubExprRef.- Common subexpressions can reference other common subexpressions, so they, together with the main query, for a DAG (directed acyclic graph) of dependencies.- Note that CTEs declared in a WITH clause but not referenced are ignored and are not part of the query tree.In the semantic query optimization phase (SQO), the current code makesa heuristic decision what to do with common subexpressions - toevaluate them multiple times (expand) or to create a temporary tableonce and read that table multiple times.If we decide to expand, the action is simple: Remove theCommonSubExprRef operator from the tree and replace it with its child.If we decide to create a temp table, things become much more difficult.We need to do several steps:- Pick one of the child trees of the CommonSubExprRefs as the one to materialize.- Undo any normalization steps that aren't compatible with the other CommonSubExprRefs. That means pulling out predicates that are not common among the references and adding back outputs that are required by other references. If that process fails, we fall back and expand the expressions.- Create a temp table tmp.- Prepare an INSERT OVERWRITE TABLE tmp SELECT * FROM cse tree that materializes the common subexpression in a table.- Replace the CommonSubExprRef nodes with scans of the temp table.- Hook up the insert query tree with the main query, such that it is executed before the main query starts.Temporary tables----------------At this time, temporary tables are created as Hive tables, with afabricated, unique name, including the session id, a unique statementnumber within the session, and a unique identifier of the commonsubexpression within the statement. The temporary table is created atcompile time. The query plan contains an operator to truncate thetable before populating it. The "temporary" Hive table is dropped whenthe executor TCB is deallocated.Several issues are remaining with this approach:- If the process exits before executing and deallocating the statement, the Hive table is not cleaned up. Solution (TBD): Clean up these tables like we clean up left-over volatile tables. Both are identified by the session id.- If the executor runs into memory pressure and deallocates the TCB, then allocates it again at a later time, the temp table is no longer there. Solution (TBD): Use AQR to recompile the query and create a new table.- Query cache: If we cache a query, multiple queries may end up with the same temporary table. This works ok as long as these queries are executed serially, but it fails if both queries are executed at the same time (e.g. open two cursors and fetch from both, alternating). Solution (TBD): Add a CQD that disables caching queries with temp tables for common subexpressions.In the future we also want to support volatile tables. However, those alsohave issues:- Volatile tables aren't cleaned up until the session ends. If we run many statements with common subexpressions, that is undesirable. So, we have a similar cleanup problem as with Hive tables.- Volatile tables take a relatively long time to create.- Insert and scan rates on volatile Trafodion tables are slower than those on Hive tables.To-do items are marked with "Todo: CSE: " in the code.