Fix bug 1343661 (clean up HBASE partitioning), bug 1347819 and bug 1343566

1343661: please refer to sql/sqlcomp/DefaultConstConstants.h for the definition of CQD HBASE_PARTITIONING.
1347819: the change is in ExExeUtilLoad.cpp to disable HASH2 for fast load.
1343566: method NADefaults::getTotalNumOfESPsInCluster() now returns the correct value if CQD PARALLEL_NUM_ESPS is set to an integer value.

Rework 1 to address Dave's review comments.

Rework 2 to address Khaled's review comments as follows. A Boolean flag (isTrafLoadPrep_) is added to class BinWA to better control the type of partitioning functions needed for the traf preparation step. When we are binding all nodes, the flag is set to TRUE, which instructs createNAFileSet() not to create hash2.

Rework 3 to address Hans's review comments. If hash2 is forced and the partitioning function in the cached table is range, do not return the cached object.

Rework 4 to address the seabase/TEST015 core reported in bug 1349990. Bug 1349990 is fixed in this rework.
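
The cache rule from Rework 3 can be illustrated with a small sketch. This is not the actual C++ change; the names CachedTable, lookup_cached and force_hash2, and the "range"/"hash2" labels, are hypothetical and exist only to show the decision: a cached entry is skipped when hash2 is forced but the cached partitioning function is range.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class CachedTable:
    """Hypothetical stand-in for a cached table descriptor."""
    name: str
    partitioning: str  # "range" or "hash2" (illustrative labels only)


def lookup_cached(cache: Dict[str, CachedTable],
                  name: str,
                  force_hash2: bool) -> Optional[CachedTable]:
    """Return the cached entry unless it conflicts with a forced hash2 request."""
    entry = cache.get(name)
    if entry is None:
        return None
    # Rework 3 rule: hash2 is forced but the cached object uses range
    # partitioning, so the cached object must not be returned.
    if force_hash2 and entry.partitioning == "range":
        return None
    return entry


cache = {"T1": CachedTable("T1", "range"), "T2": CachedTable("T2", "hash2")}
```

With this sketch, a lookup of T1 with force_hash2 set misses the cache, so the caller would have to rebuild the object, while T2 is served from the cache either way.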
A. Added new options to the load syntax:
Load [with option[, option, ...]] into <table> select ... from <table>
Option can be:
* Truncate table: by default the target table is not truncated before loading data. If the truncate table option is specified, the target table is truncated before loading.
* No recovery: by default load handles recovery using snapshots. If the no recovery option is specified, snapshots are not used. This option was previously called no rollback.
* Log errors: (not implemented yet) will be used to log error rows into a file.
* No populate indexes: by default indexes are handled by load; in this case indexes are disabled before starting the load and populated afterwards. If the no populate indexes option is specified, indexes are not handled by load.
* Constraints: (not implemented yet) will be used to handle constraints.
* No duplicate check: by default an error is generated when duplicates are detected. If the no duplicate check option is specified, duplicates are ignored.
* Stop after N errors: (not implemented yet) will be used to fail the load after N errors.
* No output: by default the load command prints status messages listing the steps that are being executed. If no output is specified, no status messages are displayed.
B. Support for duplicate detection at runtime. By default an error is generated when duplicates are detected. If no duplicate check is specified, duplicates are ignored.
C. Added index handling to load. By default, indexes are disabled before starting the load and populated afterwards.
D. Changes in the optimizer to make sure the data is always sorted. When loading small data sets, we noticed that the optimizer does not always add a sort node.
E. Added status messages that list the different steps taking place. To disable the status messages, specify the “no output” option.
F. Added checks so that the user cannot specify an option more than once.
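
The check in item F can be sketched as follows. This is an illustration only, not the parser code from this change: the function names and the lower-cased option spellings are assumptions, and the option list is taken from item A. Each option is normalized, then rejected if it is unknown or has already been seen.

```python
import re
from typing import Iterable, Set

# Option names from item A, lower-cased for comparison (spelling is assumed).
KNOWN_OPTIONS = {
    "truncate table",
    "no recovery",
    "log errors",
    "no populate indexes",
    "constraints",
    "no duplicate check",
    "stop after n errors",
    "no output",
}


def normalize(option: str) -> str:
    """Lower-case, collapse whitespace, and fold 'stop after 10 errors' into one key."""
    text = " ".join(option.lower().split())
    if re.fullmatch(r"stop after \d+ errors", text):
        return "stop after n errors"
    return text


def validate_load_options(options: Iterable[str]) -> Set[str]:
    """Return the set of normalized options, rejecting unknown or repeated ones."""
    seen: Set[str] = set()
    for raw in options:
        key = normalize(raw)
        if key not in KNOWN_OPTIONS:
            raise ValueError(f"unknown load option: {raw!r}")
        if key in seen:
            raise ValueError(f"load option specified more than once: {raw!r}")
        seen.add(key)
    return seen
```

For example, validate_load_options(["Truncate Table", "No Output"]) succeeds, while passing "no output" twice raises ValueError.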