barry fritchman <> in Trafodion

Avoid scanner timeout for Update Statistics

For performance reasons, Update Stats pushes sampling down into HBase,

using a filter that returns only randomly selected rows. When the

sampling rate is very low, as is the case when the default sampling

protocol (which includes a sample limit of a million rows) is used on

a very large table, a long time can be taken in the region server

before returning to Trafodion, with the resultant risk of an

OutOfOrderScannerNextException. To avoid these timeouts, this fix

reduces the scanner cache size (the number of rows accumulated before

returning) used by a given scan based on the sampling rate. If an

adequate return time can not be achieved in this manner without

going below the scanner cache minimum prescribed by the

HBASE_NUM_CACHE_ROWS_MIN cqd, then the scanner cache reduction is

complemented by a modification of the sampling rate used in HBase.

The sampling rate used in HBase is increased, but the overall rate

is maintained by doing supplementary sampling of the returned rows in

Trafodion. For example, if the original sampling rate is .000001,

and reducing the scanner cache to the minimum still results in an

excessive average time spent in the region server, the sampling

may be split into a .00001 rate in HBase and a .01 rate in Trafodion,

resulting in the same effective .000001 overall rate.

Change-Id: Id05ab5063c2c119c21b5c6c002ba9554501bb4e1

Closes-Bug: #1391271

Eliminate manual steps in load/ustat integration

The fix achieves full integration of the bulk load utility with

Update Statistics. The Hive backing sample table is now creeated

automatically (formerly, we only wrote the HDFS files to be

used by the Hive external table), the correct sampling percentage

for the sample table is calculated, and the ustat command is

launched fro1m the executor as one of the steps in execution of

the bulk load utility.

Change-Id: I9d5600c65f0752cbc7386b1c78cd10a091903015

Closes-Bug: #1436939

  1. … 12 more files in changeset.
New ustat algorithm and bulk load integration

This is the initial phase of the Update Statistics change to use

counting Bloom filters and a Hive backing sample table created

during bulk load and amended whenever a new HFile for a table

is created. All changes are currently disabled pending some

needed fixes and testing.

blueprint ustat-bulk-load

Change-Id: I32af5ce110b0f6359daa5d49a3b787ab518295fa

    • -0
    • +225
    • -0
    • +297
Misleading error when Hivestats can't be created

When Update Statistics is performed on a Hive table, the

trafodion.hivestats schema is created if it doesn't already

exist. If the user issuing the ustat statement does not

have the create_schema component privilege, this will fail, but

the code as it existed ignored the error, assuming it was

a "schema already exists" error, which is expected in most

cases. If the schema does not exist, an error is then given

when an attempt is made to create the histograms table in

the non-existing schema.

This has been changed to check the error when the schema

creation attempt fails, so the correct error is generated.

This is a partial fix, because a user with manage_statistics

privilege should not be prevented from executing Update

Statistics just because the trafodion.hivestats schema did

not already exist. The likely full solution would be to

create the schema automatically at Trafodion initialization.

Change-Id: I26561100bab9f6a59f3f5d0d56ce9d2aab380fa3

Partial-Bug: #1425748

Errors reported for Update Stats on Hive tables

Attempts to execute Update Statistics statements on Hive

tables fail with error 1002 (catalog does not exist). The

catalog used to store Hive stats was no longer being

successfully created on demand. The Trafodion catalog is

now used instead, with a schema for Hive stats created

in it on first use. The tables for histograms and

histogram intervals for Hive tables are created in that

schema on first use.

Change-Id: Ib57d0af4a3da6f52f0544d6c2fce3e77d2e823c1

Closes-Bug: #1320397

Don't exclude _SALT_ from requested histograms

The optimizer sets the referenced property for non-user columns

to NOT_REFERENCED, which may override a prior value indicating

that it is referenced for histograms. This prevents Update

Stats automation from creating an empty histogram for it to be

replaced later with an actual histogram when Update Statistics

... On Necessary Columns is executed. The _SALT_ column is part

of the primary key, however, and it is useful to have a histogram

for it. The code was changed to avoid overwriting the referenced

field for the salt column so that its histogram might be used.

It is excluded, however, from the list of warnings for histograms

that are requested but not present, as it is not a column explicitly

created by the user.

Change-Id: If13485a3887484e83fec26e9422050024cad5e62

Closes-Bug: #1411472

Ustat fails with duplicate CLI statement name

Under certain circumstances, processing of the ON NECESSARY

COLUMNS clause will occur while another dynamic cursor is

still open, causing the default statement name used by

HSCursor to be duplicated. When instantiating HSCursor for

NECESSARY, a unique statement name is now supplied to the


Another unrelated change is included; the access clause for

the query to retrieve necessary columns now specifies FOR


is a response to a code review comment for the delivery of

ustat automation that was not included in that delivery

because of Git problems.

Change-Id: I574b62d0d665fd4b6fd79fd0c6184050bef82d63

Turn off ustat row count estimation by default

The initial execution of Update Statistics after loading a

table sometimes gets inaccurate HBase row count estimates.

Until the cause of this variance is determined and fixed,

the cqd controlling use of estimation has been set to OFF

by default.

Change-Id: I698f5ac8407495f890a242c82a1946a5506ceebf

Partial-Bug: #1402031

Automated collection of necessary statistics

When the optimizer requests a histogram for a given column, and that

histogram does not exist, it may (depending on the cqds in effect) register

a request for the histogram to be created at a later time, or utilize a

small sample to generate a rudimentary histogram on the fly. In either

case, when a subsequent Update Statistics statements specifies the ON

NECESSARY COLUMNS clause, any column of the target table that has been

the subject of one of these actions will have a bona fide histogram


blueprint ustat-automation

Change-Id: Ieceac3e1d84bf8091a5bf340dc1739a447ad2436

Lookup object UID for Sequence generator

The NATable object representing a Sequence generator sometimes has

an object UID of 0 at the time privileges are checked, resulting

in failure to locate the object and denial of access. This change

looks up the UID if the corresponding NATable member variable is 0.

Change-Id: I889d2d823b8c0faca5318572250b45e07134b6cf

Change default for USTAT_USE_IS_WHEN_NO_STATS

This cqd was originally set to OFF by default, causing

internal sort not to be used in cases where there are

no existing histograms. Internal sort is known to

underperform when there are a small number of unique

values for a column, and without prior histograms, the

unique entry count is unknown.

However, experience with POCs has shown that using internal

sort in these situations (unique count unknown) is in

general significantly faster than not using it, so the

default setting has been changed to ON.

Running developer regressions with this setting exposed a

bug that occurs when internal sort is used on an empty table

that previously did contain rows. This change includes a known

diffs file for the failing test until the bug (LP 1393930)

can be fixed.

Change-Id: I9ff6672a168016fb83fadb26b40a30c7fa3eee6c

Closes-Bug: #1392555

    • -508
    • +4
Change HSTableDef to recognize volatile tables

The redefinition of the isVolatile() virtual function for

some subclasses of HSTableDef originally returned FALSE, and

had not been updated since the addition of volatile tables.

Now the function always determines whether the table is volatile

based on the name of the containing schema.

The symptom of the bug was a core file produced for certain

operations on a virtual table, such as bulk load.

Change-Id: I1d592935d4922e8ce8202bf7429f929275ebae0f

Closes-Bug: #1388207

Use HBase estimated row count if no stats exist

In the absence of statistics, the row count for an HBase table

is estimated using information in its HFiles. Previouly, this

estimate was calculated and assigned to the NATable's

originalCardinality_ member variable, but this did not result

in it being used by the optimizer (the correct row count did

not appear in the results of an Explain). The code to calculate

the estimate is now invoked from HSHbaseTableDef::getRowCount(),

which has the effect of it being used as intended.

Change-Id: I3794e8368082b6f17e47fe686ec9a004aed27976

Closes-Bug: #1388918

Ensure HFile.Reader objects are closed

Performance workloads are seeing mxosrvrs with a large accumulation

of open sockets, and a resultant decrease in performance. The cause

was HFile.Reader objects used in estimating a table's row count not

being closed. This fix executes a close() on the Reader in a finally

clause, such that it will always be executed even if an exception

is thrown.

Even with this fix, it was noted that early phases of workloads showed

a significant increase in compile times when row count estimation was

enabled. To address this, the estimated counts were saved so that they

would not be repeated as long as the NATable remained in cache. Also,

estimation was avoided completely for metadata tables. With these

changes, my own tests showed a very low overhead for estimation, on

the order of 60-70ms per user table in the workload, regardless of

query complexity or how many queries referenced the table. However,

the results of performance benchmarks executed on clusters did not

improve, and so this change also includes turning off the cqd that enables

row count estimation for the optimizer (a separate cqd controls use

of the feature for Update Statistics, and it is still on by default).

Future work will attempt to address the performance impact of turning

on the cqd.

Change-Id: Ied0369c8def5062d69766198155f8e309bae1ff8

Make corrections to HFile path

To estimate the row count for an HBase table, the HFiles are

accessed directly from the HDFS file system. When the Trafodion

name of the table includes a delimited id, the quotes must not

be included in the node of the HDFS path representing the

qualified table name. In addition, the hbase.rootdir property

in hbase-site.xml may consist solely of a path rather than a

full URL. It was previously assumed that a full URL would be

present, and the value of the property was used to construct

a object. When a string consisting of only a file

path is passed to the URI constructor, a NullPointerException

is thrown (instead of the expected URISyntaxException), causing

the path we construct to the HFile to be incorrect. The code

was changed to utilize either a file path or a valid URI as the

value of the property.

Change-Id: If35d9da7aaab815a9c1d550bc505d86f0cbcf611

Closes-Bug: 1384959

Eliminate minor estimated nullcount inaccuracies

When the relative frequency of null values is estimated via

sampling when getting an estimated row count for an HBase

table, there is an (unlikely) situation in which the null

count could be thrown off. If the primary key consists of a

single column, and some row has all null values except for

the primary key, those nulls will be counted incorrectly.

This was caused by comparing two successive KeyValue positions

using < rather than <=.

In addition, if the end of the HFile is reached while taking

the sample, any nulls at the end of the last row will not be

counted, and this has been fixed as well.

o Closes-Bug: #1383835

Change-Id: Ia449d1379d851e8df0f7811e835b5730851c33e2

Use new HFIle location for HBase 0.98

The code to lookup the HFiles for a given table was changed

to accommodate their new location within the HDFS file system.

The location seems to have changed at some point between 0.94

and 0.98, although I could find no documentation of the change,

nor any HBase property that could be used to identify the

path to the filesystem node under which HFiles for tables

are stored. This change affects lookup of HFiles for the purpose

of estimating the number of rows in a table.

Change-Id: I17b892144c03a2e9cebd3de22838177ffb6a324d

Closes-Bug: 1379862

Eliminate heap corruption error in arkcmp

Allocation from heap in ustat/hs_globals.cpp used () instead

[] for char array, which has been corrected.

Change-Id: I630b98b5b01deb28d03931d9c3746d3d3d7741c0

Closes-Bug: 1377260

Estmate NHase row count when stats not available

When statistics are not available for a table, give the optimizer

a better estimate of its cardinaly than the default value by

reading summary information from the trailer block of the table's


blueprint estimate-rowcount

Change-Id: Ie4a44bc7c87385c551eef96478147117b151ab9b

Ensure UID returned for MD tables is non-null

When an NATable is created for a metadata table, the UID is not known.

Under certain conditions, this creates a problem whereby the AUTHS

table can not be read, and access to a table is incorrectly denied.

In this fix, the NATable::objectUid() function is changed to do a

lookup of the UID for a metadata table if it is 0, and to store the

result in NATable for future calls.

Change-Id: I5eb58871070e585a7242f92becde7e6a24a8f5f5

Closes-Bug: #1370289

Provide quick row count estimation for Ustat

Update Statistics needs an estimation of the cardinality of an HBase

table, which to this point has been provided by the result of selecting

count(*) from the table with an internal query. This incurred a

significant overhead for large files, and also occasionally resulted in

an 8448 error due to a known coprocessor problem. The approach

implemented by this fix is to access the HFiles through the FileSystem

interface and read the EntryCount field in the trailer block of each

file. Some sampling of initial data blocks is done to determine the

expected number of missing KevValues due to nulls and the number of

non-PUT KeyValues. The number of rows is estimated by dividing the

adjusted count by the number of columns in the table. The MemStore of

each of the table's regions is checked to get the total storage for

the table outside of HFiles, and the number of rows in memory is

estimated using the total MemStore size and the size-to-rowcount ratio

for the HFiles.

Change-Id: I7435ec3c765992084947b9dc7f8540c779f1f5d3

Closes-Bug: #1321857

Avoid pushdown of sampling to HBase if oversampling is used

In general, when a Sample clause is present that specifies random

sampling, the sampling is pushed down to the HBase layer and performed

using a filter. When oversampling (sampling rate > 100%) is used, this

can not be done because the filter used is incapable of returning

multiple copies of a given row. The result would be to return all

selected rows once, instead of a greater number of rows that is

consistent with the oversampling.

This change opts out of sampling pushdown if the sample percentage is

greater than 100%, performing the sampling instead with a Sample

operator within Trafodion.

Note that the function RelSample::isSimpleRandomRelative() formerly

contained a comment stating that it returned FALSE if oversampling was

used, although this was not the case. Rather than adding that check to

the function, the comment was removed and the check for oversampling

was added to RelSample::bindNode() as a separate pushdown check.

isSimpleRandomRelative() is also used as a condition for parallel

execution, which should not be ruled out simply because random

oversampling is used.

Closes-Bug #1340432

Change-Id: I6894be2735cf3f7e5a5838bfcab4e1055a7274cd

Fix histograms for primary key of salted tables

Users attempting to use Update Statistics to create a multi-column

histogram (MC) corresponding to the primary key of a salted table may be

unaware that the "_SALT_" column is implicitly prepended to the key as

stated in the Create Table statement, and omit it. This fix will cause

Update Stats to detect a request for a multi-column histograms that

specifies the primary key columns (or a prefix of the full key), and

add _SALT_ to it if missing, and order the MC to match the order of the

columns in the primary key.

The change only affects salted tables, and is only applied if neither

the ON EVERY KEY nor ON EVERY COLUMN clauses is present, because an MC

matching the full primary key is automatically generated in those cases.

A second part of this fix applies to cases where ON EVERY KEY or ON

EVERY COLUMN is specified in an Update Statistics statement on a salted

table. By default, MCs corresponding to subsets of the primary key will

no longer be generated automatically in this case. The cqd

USTAT_ADD_SALTED_KEY_PREFIXES_FOR_MC may be set to 'ON' to cause MCs

for subsets of the primary key to be generated.

Closes-Bug #1336983

Change-Id: I930fef13371d5d773a0df44601cf99a8a4dc8322

Update Statistics performance improved by sampling in HBase

Update Statistics is much slower on HBase tables than it was for

Seaquest. A recent performance analysis revealed that much of the

deficit is due to the time spent retrieving the data from HBase that

is used to derive the histograms. Typically, Update Statistics uses a

1% random sample of a table’s rows for this purpose. All rows of the

table are retrieved from HBase, and the random selection of which rows

to use is done in Trafodion.

To reduce the number of rows flowing from Hbase to Trafodion for queries

using a SAMPLE clause that specifies random sampling, the sampling logic

was pushed into the HBase layer using a RandomRowFilter, one of the

built-in filtersrovided by HBase. In the typical case of a 1% sample,

this reduces the number of rows passed from HBase to Trafodion by 99%.

After the fix was implemented, Update Stats on various tables was 2 to 4

times faster than before, when using a 1% random sample.

Change-Id: Icd40e4db1dde444dec76165c215596755afae96c

Fix bug 1327395, failure to create key histograms

For an index on multiple columns, the "ON EVERY KEY" option of

Update Statistics was not creating the expected multicolumn

histograms. This was due to failure to use the correct loop

index in the loop that enumerated the component columns.

Change-Id: I2f24237fdadbb954e6239f44957fbe3c6626312c