Costing and statistics compiler interfaces for UDFs

blueprint cmp-tmudf-compile-time-interface
bug 1433192
This change adds compiler interfaces for UDFs that provide statistics for the result table and a cost estimate. It also adds more code for the upcoming Java UDF feature: retrieving the updated invocation infos and returning them to the executor/compiler C++ code.
Description of the changes in more detail:
- Addressed remaining review comments from my last checkin,
  https://review.trafodion.org/1655
- Make sure that user-generated exceptions during deallocation of a
  routine are reported. These happen in the destructor of the object
  derived from tmudr::UDR. For Java, we may need a deallocate method.
- Java and JNI code to serialize the updated UDRInvocationInfo and
  UDRPlanInfo objects after calling the user code and return them
  through the JNI interface to the calling C++ code.
- The cost method source files had some inline methods defined in the
  .cpp file and used an include file that included other .cpp files.
  Make didn't pick up changes made in these files. Removed this code
  and changed it to regular methods and inlines.
- Replaced some Context * parameters in costing with PlanWorkSpace *,
  to be able to get to UDF-related info that's stored in a special
  PlanWorkSpace.
- Changed the behavior of isBigMemoryOperator() for TMUDFs. If the UDF
  writer specifies the DoP for the UDF invocation, then consider it a
  BMO.
- If possible, synthesize the HASH2 partitioning function of a TMUDF's
  child as the partitioning function of the UDF. This can be done if
  the partitioning key gets passed through the UDF.
- Statistics interface for TMUDFs:
  - TMUDF now populates the statistics field in the UDRInvocationInfo
    object and calls the describeStatistics() method.
  - Added an estimated # of partitions for partitioned input tables of
    TMUDFs. Also changed row count methods to "estimated" row count.
  - Added code to incorporate the information on row count and UEC
    provided by the UDF writer into the statistics of the TMUDF. This
    code is not well suited to being the default implementation of
    describeStatistics(). Therefore, the default implementation of
    describeStatistics() does nothing, but the compiler applies some
    heuristics in case the UDF writer provides no statistics.
- Changed cost method for TMUDFs to incorporate an estimated cost per
  row from the UDF writer. There is no special compiler interface call
  to ask for the cost; it can be set from the
  describeDesiredDegreeOfParallelism() call and, once supported, from
  the describePlanProperties() call. Note that we don't have immediate
  plans to support describePlanProperties(); that might come after 2.0.
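
For reviewers, the flow described above (the UDF writer sets a DoP and a per-row cost, and the compiler then treats the UDF as a BMO and folds the cost into its estimate) can be sketched roughly as follows. This is only an illustration under stated assumptions: the struct and function names, the nanosecond unit, the row-count threshold and the "-1 means not set" convention are all hypothetical stand-ins, not the actual tmudr classes or the cost method code in this change.

```cpp
#include <cassert>

// Hypothetical stand-in for the plan-level info a UDF writer can fill in.
// This is NOT the real tmudr::UDRPlanInfo class.
struct PlanInfoSketch {
    long costPerRowNanos = -1;  // assumption: -1 means "no estimate supplied"
    int desiredDoP = 0;         // assumption: 0 means "let the compiler decide"
};

// Hypothetical stand-in for the invocation-level info seen by the UDF writer.
struct InvocationInfoSketch {
    long estimatedInputRows = 0;
};

// Models a UDF writer's describeDesiredDegreeOfParallelism() callback:
// there is no dedicated cost call, so the cost is set here as well.
void describeDesiredDegreeOfParallelism(const InvocationInfoSketch &info,
                                        PlanInfoSketch &plan) {
    // Illustrative threshold and values only.
    plan.desiredDoP = (info.estimatedInputRows > 1000000) ? 8 : 1;
    plan.costPerRowNanos = 2000;
}

// Models the BMO rule: a UDF whose writer specified the DoP counts as a BMO.
bool isBigMemoryOperator(const PlanInfoSketch &plan) {
    return plan.desiredDoP > 0;
}

// Models how a cost method could use the writer's per-row estimate and
// fall back to a compiler default when the writer supplied none.
double estimateUdfCostNanos(const PlanInfoSketch &plan,
                            long estimatedRows,
                            long defaultCostPerRowNanos) {
    assert(estimatedRows >= 0);
    const long perRow = (plan.costPerRowNanos >= 0) ? plan.costPerRowNanos
                                                    : defaultCostPerRowNanos;
    return static_cast<double>(perRow) * static_cast<double>(estimatedRows);
}
```

The fallback branch mirrors the behavior described for statistics: when the UDF writer provides nothing, the compiler uses its own default instead of failing.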
Patch Set 3: Addressed Dave's review comments.
Patch Set 4: Fixed misplaced copyright in expected file.