ExpLOBaccess.cpp

Fixes from review to sqvers

commit 04f3812f112a5629a563f02d7e72c5fa503c6a8d

Author: Sandhya Sundaresan <sandhya.sundaresan@hp.com>

Date: Sun Jun 14 04:23:21 2015 +0000

Preliminary checkin of lob support for external files. Inserts from http

files, hdfs files and lob local files are supported. Added support for

new extract syntax. Extract from lob

columns to hdfs files has been added. More work needed to support

binary files and very large files. Current limit is 1G.

Also fixed some error handling issues

Fixed some substring warning issues in the lobtostring/stringtolob

functions.

Added references and interfaces to curl library that is needed to read external http

files.

More work needed before this support can be used

Change-Id: Ieacaa3e4b7fa2a040764888c90ef0b029f107a8b

Change-Id: Ife3caf13041f8106a999d06808b69e5b6a348a6b

  1. … 29 more files in changeset.
Move core into subdir to combine repos

  1. … 10768 more files in changeset.
Move core into subdir to combine repos

  1. … 10622 more files in changeset.
Move core into subdir to combine repos

Use: git log --follow -- <file>

to view file history thru renames.

  1. … 10837 more files in changeset.
Add new tmlib callback to propagate startid.

Change-Id: I42dac4838228c14d445bed2692dceaae2586a36e

  1. … 24 more files in changeset.
Patch for logging udf, udr log and hive failures

1. Fixed the event log reader udf to read all log files created by

monitor, ssmp, sscp and mxlobsrvr, udr

2. Added a new log file and config file for udrserver. It will be

created when udrserver starts up.

3. Fix for current hive failures. Destructor for ExLob was calling

hdfsDisconnect. This had to be commented out since it caused

closeChannel errors and cores during hive access. LP 1433882 created for a potential leak.

Change-Id: I0613b352248a4d796604346f1a495f7606d21d4c

  1. … 5 more files in changeset.
Patch for comments from Dave B's review.

Fixed copyrights and some error handling

This change set includes changes to log4cpp infrastructure to split the

single config file into separate ones for each component.

Changed the existing tm.log file so the numerous 0 byte files are not

created (LP 1409226)

Node numbers are added to ssmp, sscp and lobserver log files which are

unique to each node.

Currently the following are examples of the log files that will be seen in the sqf/logs

directory:

master_exec_0_30350.log

mxlobsrvr_0.log

mxlobsrvr_1.log

sscp_0.log

sscp_1.log

ssmp_0.log

ssmp_1.log

startup.log

tm_0.log

tm_1.log

trafodion.dtm.log

trafodion.hdfs.log

This change set includes one change to cleanup lob files during

initialize trafodion, drop.

commit 6b48cd7abe6cce7c9a1d722a6a67f151a428fbf8

Author: Sandhya Sundaresan <sandhya.sundaresan@hp.com>

Date: Wed Mar 11 18:20:16 2015 +0000

Change-Id: I8d06716a1cac464454e01e5267460efa1859747f

  1. … 24 more files in changeset.
Fix regression from 1225 - abend from hive scan

This change fixes a problem merge on Mar 4 which will

cause an abend and corefile when scanning hive tables.

Change-Id: I8e115d85e1dbedb11e3f286fe4735301feb3c21b

Cleanup hive reader thread when canceling scan

Cancel processing, whether triggered by [FIRST N], error

handling, SQL cancel, or any other reason, can cause the

main executor thread to abruptly stop interacting with

the reader threads. This change fixes a hang caused by

a reader thread waiting for the main thread to give it

an empty buffer, after the main thread has finished the

canceled query.
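
The hang described above can be illustrated with a minimal sketch. It is not the code from this change: the class EmptyBufferPool and its methods are hypothetical names, and standard C++ threading primitives are assumed in place of whatever the executor actually uses. The idea is that a reader waiting for an empty buffer also watches a cancel flag, so that the main thread can release it when the query is canceled.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the pool of empty buffers shared by the
// main executor thread and a reader thread.
class EmptyBufferPool {
public:
    // Main thread: hand an empty buffer back to the reader.
    void release() {
        {
            std::lock_guard<std::mutex> g(m_);
            ++empty_;
        }
        cv_.notify_one();
    }

    // Main thread: the query was canceled, so wake every waiting reader.
    void cancel() {
        {
            std::lock_guard<std::mutex> g(m_);
            canceled_ = true;
        }
        cv_.notify_all();
    }

    // Reader thread: block until a buffer is free or the scan is canceled.
    // Returns false on cancel so the reader can exit instead of hanging.
    bool acquire() {
        std::unique_lock<std::mutex> lk(m_);
        cv_.wait(lk, [this] { return empty_ > 0 || canceled_; });
        if (canceled_)
            return false;
        assert(empty_ > 0);
        --empty_;
        return true;
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    int empty_ = 0;
    bool canceled_ = false;
};

// Starts a reader that waits for a buffer, cancels the scan, and reports
// whether the reader obtained a buffer.
bool readerGotBufferAfterCancel() {
    EmptyBufferPool pool;
    bool got = true;
    std::thread reader([&] { got = pool.acquire(); });
    pool.cancel();
    reader.join();
    return got;
}
```

In this sketch the reader returns without a buffer whether it started waiting before or after cancel() was called, because the flag is checked under the same mutex that guards the wait.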

Change-Id: Ib41a8a0036b7aab8dedf7d10ee55eb2007f7c265

Closes-Bug: #1425661

  1. … 4 more files in changeset.
Fix for bug 1329361 and minor improvements in hive scan

This is a joint checkin with Sandhya Sundaresan.

PreOpens during hive scan were not working correctly, causing the same range

to be read twice. The read triggered by the preopen was not used since it had

an incorrect cursor name. This is now fixed.

Reduced number of LOB threads to 2

Removed multi cursor code since it is not used

Skip reading the first range if it has 0 bytes to be read.

Change-Id: I91cff41134490435165da7d59c955c7215b3c6b8

  1. … 6 more files in changeset.
Fix for hang encountered during hive tests.

Added proper copyright.

Added comment to make it clear to readers of the code.

Hive tests have been hanging for quite a while on the official slave

machines.

The cause for this particular hang was that the main thread was

destroying the cursor (ExLobCursor) object. The worker thread was kind

of slow and it continued to access the cursor after the main thread had

destroyed it. pthread calls exhibit undefined behavior when

uninitialized mutex or condition variables are accessed. Added code to

prevent this timing issue in the worker threads.

Added trace utility to diagnose hangs and execution issues.

The trace messages look like this and get logged into trace file on the

local directory named trace_threads.<pid>. It is controlled by an

environment variable TRACE_HDFS_THREAD_ACTIONS. The envvar is checked

only once and the file handle is checked each time the trace needs to be

done. If file handle is NULL, no message is logged.
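
A minimal sketch of such a trace helper, assuming a POSIX system, might look like the following. Only the environment variable name and the trace_threads.<pid> file name come from the description above; the function names traceFileName and traceThreadAction are hypothetical and the actual utility may be structured differently.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <string>
#include <unistd.h>

// Hypothetical helper: builds the trace file name for a given pid.
std::string traceFileName(long pid) {
    return "trace_threads." + std::to_string(pid);
}

// Returns the trace file handle, or NULL when tracing is disabled.
// The envvar is consulted only once: the static local is initialized
// on the first call, and C++11 makes that initialization thread-safe.
static FILE *traceFileHandle() {
    static FILE *fp = []() -> FILE * {
        if (std::getenv("TRACE_HDFS_THREAD_ACTIONS") == nullptr)
            return nullptr;
        return std::fopen(
            traceFileName(static_cast<long>(getpid())).c_str(), "a");
    }();
    return fp;
}

// Hypothetical helper: logs one message if tracing is enabled.
// Returns true when the message was written to the trace file.
bool traceThreadAction(const char *msg) {
    assert(msg != nullptr);
    FILE *fp = traceFileHandle();   // handle is checked on every call
    if (fp == nullptr)              // NULL handle: no message is logged
        return false;
    std::fprintf(fp, "%s\n", msg);
    std::fflush(fp);
    return true;
}
```

With the variable unset, each call costs one pointer comparison, which is why checking the handle on every trace call is cheap enough to leave in place.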

Change-Id: I95519b61d71339c719e37500bccae111ce070a15

  1. … 1 more file in changeset.
LOB datatype infrastructure support

Technology preview.

More changes expected as part of this work before it

is user ready.

blueprint lob-support

This checkin contains basic support for create blob/clob datatypes.

The feature is disabled by default. Instructions on how to enable are

listed below.

New test executor/TEST130 that turns the feature on and tests out the

functionality.

New mxlobsrvr process will be started as part of sqstart.

Create and drop of tables with LOB columns.

No support for alter.

DML support for LOB datatypes.

Insert, update and deletes. Joins of 2 tables with LOB columns are

allowed but joins on LOB columns themselves are not allowed.

Insert-select from one LOB table to another is not yet supported.

Link to document from the blueprint will be added shortly.

To enable and try LOBs:

On a developer workstation:

cqd TRAF_BLOB_AS_VARCHAR 'OFF';

On a cluster after installing the code 2 steps are needed:

1. cqd TRAF_BLOB_AS_VARCHAR 'OFF';

2. sudo su hdfs --command "hadoop fs -mkdir /lobs"

sudo su hdfs --command "hadoop fs -chown -R trafodion:trafodion /lobs"

This checkin includes several merges from the mainline and each of

the lines below represents one commit to the project branch where this

work was done.

-Turning off LOB code by default. But turning it on in

executor/TEST130 just to ensure testing the code path.

-Support for showddl and some syntax for external files and stream.

-LOB regression test

-Workaround for dtm issue LP 378167

-Changes to make append work. Changes to use lob heap.

-Fix for using system heap for all LOB allocations and handling NULLs.

-Added workaround for cursor delete issue. LP 1376969

-Fixes for update.

-Parser changes for exe_util_lob_extract

-Pull in lob extract code

-Adding mxlobsrvr directory

-Fixed the LOB interface to use 2 new params for cursor fetches. They do

not overload the LOB Handle and LOB handle length anymore.

Added a flag to lobGlobals to identify it's a hive access.

Cleaned up parser code.

-LOB support for create,drop,insert,delete,select.

Change-Id: I7c8125696e847b71580b746388632e75741bd347

  1. … 52 more files in changeset.
Bug 1363162. Added a change #2 suggested after code review.

Change-Id: I8acf9b5598cc8f7ac3dbcdec59b7814bc77c680e

Fix SIGSEGV in ExLobsOper

This change fixes a problem in using a pointer after it has

been deleted.

Closes-Bug: #1362612

Change-Id: I9e24d06d351c865fb6ecca63280ef752a9d355c8

Adding support to use Hadoop 1.0 in a build environment

Steve Varnau found this issue when setting up a Hortonworks test machine.

HDP 1.3 uses Hadoop 1 interfaces and the libhdfs hdfsDelete() method

has only two parameters in Hadoop 1 while it has three parameters in

Hadoop 2.

Adding an environment variable to be able to use conditional compilation

for the calls to hdfsDelete().
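
As a rough illustration of the conditional compilation described above, the sketch below wraps the call in one helper. The macro name USE_HADOOP_1 and the wrapper deleteHdfsPath are hypothetical, since the message does not name the environment variable or the macro it maps to. The hdfsFS typedef and the hdfsDelete bodies are stand-ins so the sketch compiles without Hadoop headers; real code would include hdfs.h and link against libhdfs.

```cpp
#include <cassert>
#include <string>

// Stand-ins for the libhdfs declarations (assumption: real code would
// get these from hdfs.h instead of defining them here).
typedef void *hdfsFS;

#ifdef USE_HADOOP_1   // hypothetical macro, set by the build from an envvar
// Hadoop 1 form: two parameters.
int hdfsDelete(hdfsFS, const char *) { return 0; }
#else
// Hadoop 2 form: a third "recursive" parameter.
int hdfsDelete(hdfsFS, const char *, int) { return 0; }
#endif

// Hypothetical wrapper so that callers do not repeat the #ifdef.
int deleteHdfsPath(hdfsFS fs, const std::string &path) {
    assert(!path.empty());
#ifdef USE_HADOOP_1
    return hdfsDelete(fs, path.c_str());
#else
    // The value of the extra argument is chosen arbitrarily in this sketch.
    return hdfsDelete(fs, path.c_str(), 0);
#endif
}
```

Keeping the #ifdef in a single wrapper means only one place has to change if the build later targets a different Hadoop version.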

In principle this means that we would need to produce two binaries, one

for Hadoop 1 and a second one for Hadoop 2, but I suspect that the same

binary will work ok for both. That's for two reasons: First, the value

of the extra parameter is irrelevant for our situation and, second, I

don't know when we actually delete an HDFS directory - maybe when we

do an INSERT OVERWRITE TABLE?

Change-Id: Ib5d17406a26c98abe39a2130a0cf852e8c4347bb

  1. … 4 more files in changeset.
Fixes for a couple of hive test hangs that show up on slower VMs

1. Move the postfetchBufList freeing code AFTER worker threads are gone.

That way we are sure no one is using these buffers nor needs them. This

would avoid the case where a very slow worker thread is blocked waiting

for a free buffer while the main thread is not consuming.

2. Fix for each worker thread to post a wakeup before going away. This

change is to avoid the "lost wakeup" problem where one thread was busy

and never got a chance to get woken up.
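
The two fixes above can be sketched together as follows. This is an illustration under assumptions, not the code from the change: WorkerGroup, workerMain and runAndFree are hypothetical names, and standard C++ threads stand in for the threading calls used in the executor. Each worker posts a wakeup as its last action, and the main thread frees the buffers only after every worker is gone.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical shared state between the main thread and its workers.
struct WorkerGroup {
    std::mutex m;
    std::condition_variable cv;
    int active = 0;   // number of workers that have not yet exited
};

// Worker body: do the work, then post a wakeup before going away so
// that a thread waiting on cv is never left sleeping.
void workerMain(WorkerGroup &g) {
    // ... fetching into buffers would happen here ...
    {
        std::lock_guard<std::mutex> lk(g.m);
        assert(g.active > 0);
        --g.active;
    }
    g.cv.notify_all();
}

// Main thread: start the workers, wait until all of them are gone, and
// only then free the buffers. Returns the number of buffers freed.
std::size_t runAndFree(int numWorkers,
                       std::vector<std::vector<char>> &buffers) {
    WorkerGroup g;
    g.active = numWorkers;
    std::vector<std::thread> threads;
    for (int i = 0; i < numWorkers; ++i)
        threads.emplace_back(workerMain, std::ref(g));
    {
        std::unique_lock<std::mutex> lk(g.m);
        g.cv.wait(lk, [&g] { return g.active == 0; });
    }
    for (auto &t : threads)
        t.join();
    std::size_t freed = buffers.size();
    buffers.clear();   // safe: no worker can touch the buffers now
    return freed;
}
```

Because the wait uses a predicate checked under the mutex, a notification posted before the main thread starts waiting is not lost: the predicate is already true and the wait returns immediately.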

This has been pre-reviewed by Suresh.

All dev regressions were run.

Hive tests were run on the slower VM.

Change-Id: I513f1c7d62c1f8227a176a4f90c9b22083696e9e

Initial code drop of Trafodion

    • -0
    • +2454
    ./ExpLOBaccess.cpp
  1. … 4886 more files in changeset.