Normalizer interface for TMUDFs, blueprint cmp-tmudf-compile-time-interface

Added new compiler interfaces:
- describeParamsAndColumns() has been extended to allow updating PARTITION BY and ORDER BY clauses specified for input tables.
- describeDataFlowAndPredicates() allows the TMUDF to eliminate columns not needed by the query and to push predicates into the TMUDF or to the children.
- describeConstraints() allows the TMUDF to see cardinality and uniqueness constraints of the table-valued inputs (children) and to synthesize cardinality and uniqueness constraints on the TMUDF result.
TMUDFs now have 3 function types:
  GENERIC - makes most conservative assumptions in the compiler
  MAPPER - assumes TMUDF carries no state between input rows
  REDUCER - assumes TMUDF carries no state between input partitions defined by PARTITION BY clause
Query id and user id are now available to the UDR.
Added doxygen documentation for the C++ UDR interface. The resulting web page will be published on the wiki. To generate the documentation yourself, do the following:
  cd $MY_SQROOT/../sql/sqludr
  doxygen doxygen_tmudr.1.6.config
  # now open tmudr_1.0/html/index.html in a web browser
Snapshot Scan changes

The changes in this delivery include:
- Decoupling the snapshot scan from the bulk unload feature. Setup of the temporary space and folders before running the query, and cleanup afterwards, used to be done by the bulk unload operator because snapshot scan was specific to bulk unload. To make snapshot scan independent of bulk unload and usable in any query, the setup and cleanup tasks are now done by the query itself at run time (in the scan and root operators).
- Caching of the snapshot information in NATable to optimize compilation time. Rework for caching: when the user sets TRAF_TABLE_SNAPSHOT_SCAN to LATEST, we flush the metadata and then turn caching back on so that the metadata gets cached again. Snapshots created after the CQD is set won't be seen if the snapshot information is already cached, unless the user issues a command or CQD to invalidate or flush the cache. One way to do that is to issue "cqd TRAF_TABLE_SNAPSHOT_SCAN 'latest';" again.
- Code cleanup
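The caching behavior described above can be illustrated with a toy model. This is only a sketch of the visible effect (a stale entry until the cache is flushed), not the NATable implementation; the class `SnapshotInfoCache`, its methods, and the snapshot names are hypothetical and exist only for this illustration.

```python
from typing import Callable, Dict, List, Optional


class SnapshotInfoCache:
    """Toy model: remembers the latest snapshot per table until flushed.

    Hypothetical stand-in for the snapshot information cached at compile
    time; it is not the Trafodion code.
    """

    def __init__(self, list_snapshots: Callable[[str], List[str]]):
        # list_snapshots returns snapshot names for a table, oldest first.
        self._list_snapshots = list_snapshots
        self._latest: Dict[str, Optional[str]] = {}

    def flush(self) -> None:
        # Models the flush that happens when the user issues
        # cqd TRAF_TABLE_SNAPSHOT_SCAN 'latest';
        self._latest.clear()

    def latest(self, table: str) -> Optional[str]:
        # Look up the snapshot list only on a cache miss.
        if table not in self._latest:
            names = self._list_snapshots(table)
            self._latest[table] = names[-1] if names else None
        return self._latest[table]


# Hypothetical snapshot store for one table.
store = {"T1": ["snap_a"]}
cache = SnapshotInfoCache(lambda table: store.get(table, []))

cache.flush()                        # user sets the CQD to LATEST
first = cache.latest("T1")           # snapshot information is cached now
store["T1"].append("snap_b")         # a newer snapshot is created later
stale = cache.latest("T1")           # the cached entry is still returned
cache.flush()                        # user issues the CQD again
refreshed = cache.latest("T1")       # the newer snapshot is picked up
```

In this model `stale` still names the older snapshot, and only the second flush makes the newer one visible, which mirrors the advice to re-issue the CQD.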
Below is a description of the CQDs used with snapshot scan:
TRAF_TABLE_SNAPSHOT_SCAN
  This CQD can be set to:
  NONE --> (default) Snapshot scan is disabled and the regular scan is used.
  SUFFIX --> Snapshot scan is enabled for bulk unload (bulk unload behavior is not changed).
  LATEST --> Snapshot scan is enabled independently of bulk unload and the latest snapshot is used if it exists. If no snapshot exists, the regular scan is used. For this phase of the project the user needs to create the snapshots using the hbase shell or other tools. In the next phase of the project, new commands to create, delete and manage snapshots will be added.
TRAF_TABLE_SNAPSHOT_SCAN_SNAP_SUFFIX
  This CQD is used with bulk unload. Its value is used to build the snapshot name as the table name followed by the suffix string.
TRAF_TABLE_SNAPSHOT_SCAN_TABLE_SIZE_THRESHOLD
  When the estimated table size is below the threshold (in MB) defined by this CQD, the regular scan is used instead of snapshot scan. This CQD does not apply to bulk unload, which maintains the old behavior.
TRAF_TABLE_SNAPSHOT_SCAN_TIMEOUT
  The timeout beyond which we give up trying to create the snapshot scanner.
TRAF_TABLE_SNAPSHOT_SCAN_TMP_LOCATION
  Location for temporary links and files produced by snapshot scan.
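As a reading aid, the way these CQDs combine for an ordinary query (not bulk unload) can be sketched as a small decision function. This is a sketch under stated assumptions, not the compiler's code: the function and parameter names are hypothetical, and it assumes that SUFFIX affects only bulk unload, so an ordinary query falls back to the regular scan in that mode.

```python
from enum import Enum
from typing import Optional


class SnapshotScanMode(Enum):
    """Values of TRAF_TABLE_SNAPSHOT_SCAN as described above."""
    NONE = "NONE"
    SUFFIX = "SUFFIX"
    LATEST = "LATEST"


def choose_scan_for_query(mode: SnapshotScanMode,
                          latest_snapshot: Optional[str],
                          estimated_size_mb: float,
                          size_threshold_mb: float) -> str:
    """Return "snapshot" or "regular" for a query that is not a bulk unload.

    Hypothetical helper for illustration only.
    """
    if mode is SnapshotScanMode.NONE:
        # Default: snapshot scan is disabled.
        return "regular"
    if mode is SnapshotScanMode.SUFFIX:
        # Assumption: SUFFIX enables snapshot scan for bulk unload only.
        return "regular"
    # LATEST: use the latest snapshot if one exists ...
    if latest_snapshot is None:
        return "regular"
    # ... and the table is not below the size threshold
    # (TRAF_TABLE_SNAPSHOT_SCAN_TABLE_SIZE_THRESHOLD, in MB).
    if estimated_size_mb < size_threshold_mb:
        return "regular"
    return "snapshot"
```

For example, with the mode set to LATEST, an existing snapshot, and an estimated size at or above the threshold, the function returns "snapshot"; every other combination falls back to "regular".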