Fix for bug 1442932 and bug 1442966, encoding for varchar

Submitting this before finishing regressions on workstation, in the interest of time.
Key encodings for VARCHAR values used to put a varchar length indicator in front of the encoded value. The value written was the maximum length of the varchar, and the indicator was 2 or 4 bytes long, matching the length of the indicator in the source field. That length used to depend only on the maximum number of bytes in the field: for more than 32767 bytes we would use a 4-byte VC length indicator.
Now, with the introduction of long rows, the varchar indicator length for varchars in aligned rows is always 4 bytes, regardless of the character length. This causes a problem for the key encoding.
We could have computed the encoded VC indicator length from the field length. Anoop suggested a better solution, not to include the VC indicator at all, since that is unnecessary. Note that for HBase row keys stored on disk, we already remove the VC indicator by converting such keys from varchar to fixed char. Therefore, the issue happens only for encoding needed in a query, for example when sorting or in a merge join or union.
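As an illustration only, here is a minimal sketch of a key encoding without a VC length indicator. It is not the CompEncode or encodeKeyValue code. It assumes single-byte characters and blank padding up to the maximum length, similar to the varchar-to-fixed-char conversion mentioned above. The names encodeVarcharKey and compareKeys are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper (not the actual engine code): encode a single-byte
// VARCHAR value as a fixed-width key of maxLen bytes. No length indicator is
// written; the value is blank-padded to maxLen (an assumption of this sketch).
std::vector<unsigned char> encodeVarcharKey(const std::string &value,
                                            std::size_t maxLen)
{
  std::vector<unsigned char> key(maxLen, ' ');
  // Values longer than maxLen are truncated in this sketch.
  const std::size_t n = std::min(value.size(), maxLen);
  if (n > 0)
    std::memcpy(key.data(), value.data(), n);
  return key;
}

// Two keys encoded with the same maxLen can be compared byte by byte.
int compareKeys(const std::vector<unsigned char> &a,
                const std::vector<unsigned char> &b)
{
  assert(a.size() == b.size());
  if (a.empty())
    return 0;
  return std::memcmp(a.data(), b.data(), a.size());
}
```

Since every key for a given column has the same width, the encoded length no longer depends on whether the source field carries a 2 or 4 byte indicator.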
Description of the fix:
1. Change CompEncode::synthType not to include the VC length indicator in the encoded buffer. This change also includes some minor code clean-up.
2. Change the assert in CompEncode::codeGen not to include the VC indicator length anymore.
3. Changes in ex_function_encode::encodeKeyValue():
   a) Read 2 and 4 byte VC length indicators for VARCHAR/NVARCHAR.
   b) Small code cleanup, don't copy buffer for case-insensitive encode, since that is not necessary.
   c) Don't write max length as VC length indicator into target and adjust target offsets accordingly (for VARCHAR/NVARCHAR).
4. Other changes in sql/exp/exp_function.cpp:
   d) Handle 2 and 4 byte VC len indicators in hash function and Hive hash function (problems unrelated to LP bugs fixed).
   e) Add some asserts for cases where we assume VC length indicator is a 2 byte integer.
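Items a), d) and e) all deal with the width of the VC length indicator. A minimal sketch of reading either width follows. It is not the code in exp_function.cpp. The name readVcLen is hypothetical, and the sketch assumes an unsigned indicator in native byte order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not the actual engine code): read a VARCHAR length
// indicator that is either 2 or 4 bytes wide from the start of src.
// Assumes an unsigned indicator stored in native byte order.
std::uint32_t readVcLen(const char *src, std::size_t vcIndLen)
{
  // Only the two indicator widths are supported; anything else is a bug.
  assert(vcIndLen == sizeof(std::uint16_t) ||
         vcIndLen == sizeof(std::uint32_t));

  if (vcIndLen == sizeof(std::uint16_t)) {
    std::uint16_t len16;
    std::memcpy(&len16, src, sizeof(len16)); // memcpy avoids unaligned reads
    return len16;
  }

  std::uint32_t len32;
  std::memcpy(&len32, src, sizeof(len32));
  return len32;
}
```

A caller would pass the indicator width recorded for the source field rather than deriving it from the maximum field length, since aligned rows always use 4 bytes.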
CompDecode is not yet changed. Filed bug 1444134 to do that for the next release, since that change is less urgent.
Patch set 2: Copyright notice changes only.
Patch set 3: Updated expected regression test file that prints out encoded key in hex.
LOB datatype infrastructure support

Technology preview. More changes expected as part of this work before it is user ready.

blueprint lob-support
This checkin contains basic support for creating blob/clob datatypes. The feature is disabled by default. Instructions on how to enable it are listed below.
- New test executor/TEST130 that turns the feature on and tests out the functionality.
- New mxlobsrvr process will be started as part of sqstart.
- Create and drop of tables with LOB columns. No support for alter.
- DML support for LOB datatypes: inserts, updates and deletes.
- Joins of 2 tables with LOB columns are allowed, but joins on the LOB columns themselves are not allowed.
- Insert-select from one LOB table to another is not yet supported.
A link to the document from the blueprint will be added shortly.
To enable and try LOBs:
On a developer workstation:
cqd TRAF_BLOB_AS_VARCHAR 'OFF';
On a cluster after installing the code 2 steps are needed:
This checkin includes several merges from the mainline and each of the lines below represents one commit to the project branch where this work was done.
-Turning off LOB code by default. But turning it on in executor/TEST130 just to ensure testing the code path.
-Support for showddl and some syntax for external files and stream.
-LOB regression test
-Workaround for dtm issue LP 378167
-Changes to make append work. Changes to use lob heap.
-Fix for using system heap for all LOB allocations and handling NULLs.
-Added workaround for cursor delete issue. LP 1376969
-Fixes for update.
-Parser changes for exe_util_lob_extract
-Pull in lob extract code
-Adding mxlobsrvr directory
-Fixed the LOB interface to use 2 new params for cursor fetches. They do not overload the LOB handle and LOB handle length anymore. Added a flag to lobGlobals to identify that it is a Hive access. Cleaned up parser code.
-LOB support for create,drop,insert,delete,select.