Improve Error Handling in Local Directory Feeds

This change improves the handling of two error types for
filesystem-based feeds. The first is IO errors that cause the
input stream to be closed, and the second is missed filesystem
events. In both cases, we scan the directory and compare it with
the history we have in order to resume from where we left off.
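
The recovery step described above (rescan the directory, compare with the history, resume) might look roughly like the following. This is a minimal sketch, not code from the changeset: the class `DirectoryRescanSketch`, the method `findUnprocessed`, and the use of a `Set<Path>` as the history are all assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper; the names here do not come from the changeset.
final class DirectoryRescanSketch {

    // Returns the regular files in dir that are absent from the history,
    // sorted by path so that processing resumes in a stable order.
    static List<Path> findUnprocessed(Path dir, Set<Path> history) throws IOException {
        List<Path> pending = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                if (Files.isRegularFile(entry) && !history.contains(entry)) {
                    pending.add(entry);
                }
            }
        }
        Collections.sort(pending);
        return pending;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        // With an empty history, every regular file counts as unprocessed.
        System.out.println(findUnprocessed(dir, new HashSet<>()));
    }
}
```

A feed could call such a method both after an IO error closes the stream and after the filesystem watcher reports lost events, since both cases reduce to the same comparison against the history.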

In addition, this change includes some refactoring in external
data. In particular, we get rid of the stream provider layer;
instead, stream factories create input streams directly. This
is consistent with record reader factories, which create readers
directly without reader providers.

Change-Id: I08d89229e33c91532b1038ba9f7a372f7ca1fdb5

Reviewed-on: https://asterix-gerrit.ics.uci.edu/720

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Murtadha Hubail <hubailmor@gmail.com>

  1. … 143 more files in changeset.
Support Change Feeds and Ingestion of Records with MetaData

This change allows feeds to perform upserts and deletes
in order to perform replication of an external data source.

The change does so by performing the following:

1. The adapter produces [PK][Record]. (Record == null --> delete)

2. The insert is replaced by an upsert operator.
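
The two steps above can be illustrated with a small sketch. It assumes an in-memory map standing in for the dataset; `ChangeFeedSketch` and its `apply` method are hypothetical names, and the real change operates on Hyracks tuples rather than strings.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a dataset keyed by primary key.
final class ChangeFeedSketch {
    private final Map<String, String> store = new HashMap<>();

    // Applies one [PK][Record] pair: a null record means delete,
    // anything else is an upsert (insert or overwrite).
    void apply(String primaryKey, String record) {
        if (record == null) {
            store.remove(primaryKey);
        } else {
            store.put(primaryKey, record);
        }
    }

    String get(String primaryKey) {
        return store.get(primaryKey);
    }

    int size() {
        return store.size();
    }

    public static void main(String[] args) {
        ChangeFeedSketch dataset = new ChangeFeedSketch();
        dataset.apply("k1", "v1"); // insert
        dataset.apply("k1", "v2"); // overwrite, which a plain insert would reject
        dataset.apply("k1", null); // delete
        System.out.println(dataset.size());
    }
}
```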

Change-Id: If136a03d424970132dfb09f0dda56e160d4c0078

Reviewed-on: https://asterix-gerrit.ics.uci.edu/621

Reviewed-by: Yingyi Bu <buyingyi@gmail.com>

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

  1. … 269 more files in changeset.
Add flush() to IFrameWriter

This method is expected to be used with feeds to push
frames all the way to storage when needed. As of now, it is
needed in two cases:

1. When there is no activity in the ingestion node and content
needs to be pushed so it can be stored.

2. When the ingestion node needs to move the checkpoint ahead
and at-least-once semantics are used.

Two feeds make use of this method: the filesystem feed and the
Couchbase feed, which is also introduced in this change.
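
To illustrate why an explicit flush matters, here is a minimal sketch of a buffering writer. It does not use the real IFrameWriter interface: `FrameSink`, `BufferingSink`, and the use of strings as frames are simplifying assumptions.

```java
import java.util.ArrayList;
import java.util.List;

final class FlushSketch {

    // Simplified stand-in for a frame writer; not the real IFrameWriter.
    interface FrameSink {
        void nextFrame(String frame);

        void flush();
    }

    // Holds frames until the buffer fills up or flush() is called.
    static final class BufferingSink implements FrameSink {
        private final int capacity;
        private final List<String> buffered = new ArrayList<>();
        private final List<String> storage;

        BufferingSink(int capacity, List<String> storage) {
            this.capacity = capacity;
            this.storage = storage;
        }

        @Override
        public void nextFrame(String frame) {
            buffered.add(frame);
            if (buffered.size() >= capacity) {
                flush();
            }
        }

        @Override
        public void flush() {
            // Push everything buffered so far down to storage.
            storage.addAll(buffered);
            buffered.clear();
        }
    }

    public static void main(String[] args) {
        List<String> storage = new ArrayList<>();
        BufferingSink sink = new BufferingSink(3, storage);
        sink.nextFrame("f1");
        // Without flush(), "f1" would wait until two more frames arrive.
        sink.flush();
        System.out.println(storage);
    }
}
```

In this sketch an idle source leaves frames stranded in the buffer, which corresponds to the first case above; calling `flush()` before advancing a checkpoint corresponds to the second.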

Change-Id: Id862ce9e9b1360864c6976f2aea2137092f51203

Reviewed-on: https://asterix-gerrit.ics.uci.edu/585

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Murtadha Hubail <hubailmor@gmail.com>

  1. … 102 more files in changeset.
remove end-of-line whitespace

Change-Id: I5c0415f47d4c3a9827574fbdab949b45718d9ea4

Reviewed-on: https://asterix-gerrit.ics.uci.edu/601

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Yingyi Bu <buyingyi@gmail.com>

  1. … 138 more files in changeset.
First stage of external data cleanup

In this change, different parts of external data were refactored.
The goal was to make the code more modular, easier to maintain, and
more flexible to extend, in addition to reducing code redundancy.

Change-Id: I04a8c4e494d8d1363992b6fe0bdbe6b2b3b7b767

Reviewed-on: https://asterix-gerrit.ics.uci.edu/566

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Murtadha Hubail <hubailmor@gmail.com>

    -31 +28  ./ExternalFileIndexAccessor.java
    -0 +91  ./FileIndexTupleTranslator.java
    -0 +74  ./FileOffsetIndexer.java
    -0 +348  ./IndexingScheduler.java
    -0 +43  ./RCRecordIdReader.java
    -0 +84  ./RecordColumnarIndexer.java
    -0 +78  ./RecordIdReader.java
    -0 +38  ./RecordIdReaderFactory.java
    -97 +0  ./dataflow/AbstractIndexingTupleParser.java
    -239 +0  ./dataflow/AdmOrDelimitedControlledTupleParser.java
    -105 +0  ./dataflow/AdmOrDelimitedIndexingTupleParser.java
    -95 +0  ./dataflow/FileIndexTupleTranslator.java
    -140 +0  ./dataflow/HDFSIndexingParserFactory.java
  1. … 329 more files in changeset.
Divide Cluster into Unique Partitions

The change includes the following:

- Fix passing NC stores to AsterixConfiguration.

- Unify the storage directory name at the instance level rather than the node level.

- Divide the cluster into unique storage partitions based on the number of stores.

- Refactor FileSplits and move them out of AqlMetadataProvider.

- Make AsterixHyracksIntegrationUtil use the passed configuration file.

- Make File Splits pass relative index paths of partitions rather than absolute paths.

- Remove unused AqlCompiledMetadataDeclarations class.
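
One way to picture the partitioning item in the list above is the sketch below. It is an assumption-heavy illustration: the commit message does not say how partition IDs are assigned, so the sequential numbering, the `PartitionSketch` class, and the `node:store` labels are all hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; the actual assignment scheme is not described in the source.
final class PartitionSketch {

    // Gives every (node, store) pair its own cluster-wide partition id,
    // numbering sequentially in the iteration order of the input map.
    static Map<Integer, String> assign(Map<String, List<String>> storesByNode) {
        Map<Integer, String> partitions = new LinkedHashMap<>();
        int nextId = 0;
        for (Map.Entry<String, List<String>> node : storesByNode.entrySet()) {
            for (String store : node.getValue()) {
                partitions.put(nextId++, node.getKey() + ":" + store);
            }
        }
        return partitions;
    }

    public static void main(String[] args) {
        Map<String, List<String>> stores = new LinkedHashMap<>();
        stores.put("nc1", Arrays.asList("store0", "store1"));
        stores.put("nc2", Arrays.asList("store0"));
        System.out.println(assign(stores));
    }
}
```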

Change-Id: I8c7fbca5113dd7ad569a46dfa2591addb5bf8655

Reviewed-on: https://asterix-gerrit.ics.uci.edu/564

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Yingyi Bu <buyingyi@gmail.com>

    -4 +1  ./operators/ExternalDatasetIndexesAbortOperatorDescriptor.java
    -6 +4  ./operators/ExternalDatasetIndexesCommitOperatorDescriptor.java
    -3 +1  ./operators/ExternalDatasetIndexesRecoverOperatorDescriptor.java
  1. … 39 more files in changeset.
ASTERIXDB-54: s/IHyracksCommonContext/IHyracksTaskContext/

Change-Id: Id98f3d94e8036199dcbdbdb059c97c0f99ed9205

Reviewed-on: https://asterix-gerrit.ics.uci.edu/534

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Murtadha Hubail <hubailmor@gmail.com>

    -3 +4  ./dataflow/AbstractIndexingTupleParser.java
    -2 +2  ./dataflow/AdmOrDelimitedIndexingTupleParser.java
    -2 +2  ./dataflow/HDFSIndexingParserFactory.java
    -2 +2  ./dataflow/HDFSObjectTupleParserFactory.java
    -2 +3  ./dataflow/RCFileIndexingTupleParser.java
    -2 +2  ./dataflow/TextOrSeqIndexingTupleParser.java
  1. … 6 more files in changeset.
Refactored External Data

This change rearranges the order of the asterix modules:
asterix-external-data is moved in front of asterix-metadata.

Change-Id: I46b60b5e1cc37fd59adc0dd89f374d96502091b2

Reviewed-on: https://asterix-gerrit.ics.uci.edu/559

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: abdullah alamoudi <bamousaa@gmail.com>

    -0 +133  ./ExternalFile.java
    -0 +151  ./ExternalFileIndexAccessor.java
    -0 +104  ./FilesIndexDescription.java
    -0 +205  ./IndexingConstants.java
    -17 +14  ./dataflow/AdmOrDelimitedControlledTupleParser.java
    -13 +18  ./dataflow/FileIndexTupleTranslator.java
    -102 +3  ./dataflow/HDFSLookupAdapterFactory.java
    -15 +9  ./dataflow/RCFileControlledTupleParser.java
    -13 +12  ./dataflow/SeqOrTxtControlledTupleParser.java
    -2 +2  ./input/AbstractHDFSLookupInputStream.java
    -1 +1  ./input/GenericFileAwareRecordReader.java
    -9 +7  ./input/SequenceFileLookupInputStream.java
  1. … 79 more files in changeset.
Changed the IFrameWriter Contract

Updated existing operators to conform to the new contract.

These operators are either index operators or Feed operators.

The rest of the operators already follow the new contract.

Change-Id: Ibcebe876340a25be0f561945582a95211c140e10

Reviewed-on: https://asterix-gerrit.ics.uci.edu/552

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Yingyi Bu <buyingyi@gmail.com>

    -11 +21  ./operators/ExternalIndexBulkModifyOperatorNodePushable.java
  1. … 8 more files in changeset.
Clean up compilation warnings.

Change-Id: Idbfcd9c67f91d373c5f7269125778a5681021227

Reviewed-on: https://asterix-gerrit.ics.uci.edu/505

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: abdullah alamoudi <bamousaa@gmail.com>

    -9 +4  ./dataflow/HDFSIndexingParserFactory.java
    -10 +10  ./dataflow/HDFSLookupAdapter.java
    -24 +21  ./dataflow/IndexingScheduler.java
    -6 +5  ./input/AbstractHDFSLookupInputStream.java
    -13 +11  ./input/GenericFileAwareRecordReader.java
  1. … 101 more files in changeset.
ASTERIXDB-1102: VarSize Encoding to store length of String and ByteArray

This patch changes the encoding that stores the length of
variable-length types (e.g. String, ByteArray) from fixed-size
encoding (2 bytes) to variable-size encoding (1 to 5 bytes).

It resolves issue 1102 by enabling us to store a String longer
than 64K. Also, for the common case of storing a short string
(<= 127), it saves one byte per string.
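
The size claims above follow from a 7-bits-per-byte scheme, sketched below. This is an illustration under stated assumptions, not the encoder from the patch: `VarLenSketch` is a hypothetical class, and the byte order (least significant group first, high bit as a continuation flag) is assumed and may differ from the real format.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical encoder; the byte layout in the actual patch may differ.
final class VarLenSketch {

    // Encodes a non-negative length using 7 payload bits per byte.
    // The high bit of each byte signals that more bytes follow.
    static byte[] encode(int length) {
        if (length < 0) {
            throw new IllegalArgumentException("length must be non-negative");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream(5);
        int remaining = length;
        while (remaining >= 0x80) {
            out.write((remaining & 0x7F) | 0x80);
            remaining >>>= 7;
        }
        out.write(remaining);
        return out.toByteArray();
    }

    // Decodes a length produced by encode().
    static int decode(byte[] bytes) {
        int value = 0;
        int shift = 0;
        for (byte b : bytes) {
            value |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return value;
            }
            shift += 7;
        }
        throw new IllegalArgumentException("truncated length");
    }

    public static void main(String[] args) {
        for (int length : new int[] { 127, 128, 70000, Integer.MAX_VALUE }) {
            System.out.println(length + " -> " + encode(length).length + " byte(s)");
        }
    }
}
```

With 7 payload bits per byte, lengths up to 127 fit in one byte (one fewer than the old 2-byte prefix), and a 31-bit length needs at most five bytes, which matches the "1 to 5 bytes" range in the commit message.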

Some important changes include:

1. The UTF8StringSerDer and ByteArraySerDer are no longer singleton
instances. I need some state to speed up the serialization and avoid
object creation. Luckily, 99% of the serializers were already used
through a factory. The other 1% has been fixed.

A separate test support: the ExcutionTest can now produce the only.xml,
which records the tests from the runtime test.xml that previously
failed. This can speed up the debugging process.

Change-Id: I41fff780f5c071742ef10129d83c8f945d5886d7

Reviewed-on: https://asterix-gerrit.ics.uci.edu/450

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Jianfeng Jia <jianfeng.jia@gmail.com>

    -3 +4  ./dataflow/FileIndexTupleTranslator.java
    -2 +3  ./operators/ExternalIndexBulkModifyOperatorNodePushable.java
  1. … 320 more files in changeset.
Change License Headers

Also tweak the NOTICE file with some extras.

Change-Id: I09bc388089e515d7f51fd39c31bfbbc9f00cf84f

Reviewed-on: https://asterix-gerrit.ics.uci.edu/388

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Till Westmann <tillw@apache.org>

    -10 +14  ./dataflow/AbstractIndexingTupleParser.java
    -10 +14  ./dataflow/AdmOrDelimitedControlledTupleParser.java
    -10 +14  ./dataflow/AdmOrDelimitedIndexingTupleParser.java
    -10 +14  ./dataflow/FileIndexTupleTranslator.java
    -10 +14  ./dataflow/HDFSIndexingParserFactory.java
    -10 +14  ./dataflow/HDFSLookupAdapter.java
    -10 +14  ./dataflow/HDFSLookupAdapterFactory.java
    -10 +14  ./dataflow/HDFSObjectTupleParser.java
    -10 +14  ./dataflow/HDFSObjectTupleParserFactory.java
    -10 +14  ./dataflow/HiveObjectParser.java
    -10 +14  ./dataflow/IAsterixHDFSRecordParser.java
    -10 +14  ./dataflow/IControlledTupleParser.java
    -10 +14  ./dataflow/IControlledTupleParserFactory.java
    -10 +14  ./dataflow/IndexingScheduler.java
    -10 +14  ./dataflow/RCFileControlledTupleParser.java
  1. … 2004 more files in changeset.
Change Java package from edu.uci.ics to org.apache

Change-Id: I2f01d2b5614e9e9c94fda4bf1294a8eba6a26c54

Reviewed-on: https://asterix-gerrit.ics.uci.edu/309

Reviewed-by: Till Westmann <tillw@apache.org>

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

    -15 +15  ./dataflow/AbstractIndexingTupleParser.java
    -20 +20  ./dataflow/AdmOrDelimitedControlledTupleParser.java
    -17 +17  ./dataflow/AdmOrDelimitedIndexingTupleParser.java
    -19 +19  ./dataflow/FileIndexTupleTranslator.java
    -13 +13  ./dataflow/HDFSIndexingParserFactory.java
    -21 +21  ./dataflow/HDFSLookupAdapter.java
    -38 +38  ./dataflow/HDFSLookupAdapterFactory.java
    -11 +11  ./dataflow/HDFSObjectTupleParser.java
    -7 +7  ./dataflow/HDFSObjectTupleParserFactory.java
    -17 +17  ./dataflow/HiveObjectParser.java
    -2 +2  ./dataflow/IAsterixHDFSRecordParser.java
    -3 +3  ./dataflow/IControlledTupleParser.java
    -1 +1  ./dataflow/IControlledTupleParserFactory.java
    -18 +18  ./dataflow/RCFileControlledTupleParser.java
  1. … 2590 more files in changeset.
Change folder structure for Java repackage

Change only the folders, not the files, for our package name change.
This will break the build, and needs to be followed by a change to
the package name in all of the source files. However, performing
the folder move and file change in two steps lets Git understand
that the files are the same, and lets us track revisions across
those files.

Change-Id: Iefd2a576415ebc1416cba2a3334d2b64f042ba92

Reviewed-on: https://asterix-gerrit.ics.uci.edu/306

Tested-by: Ian Maxon <imaxon@apache.org>

Reviewed-by: Till Westmann <tillw@apache.org>

    -0 +92  ./dataflow/AbstractIndexingTupleParser.java
    -0 +238  ./dataflow/AdmOrDelimitedControlledTupleParser.java
    -0 +101  ./dataflow/AdmOrDelimitedIndexingTupleParser.java
    -0 +85  ./dataflow/FileIndexTupleTranslator.java
    -0 +141  ./dataflow/HDFSIndexingParserFactory.java
    -0 +183  ./dataflow/HDFSLookupAdapter.java
    -0 +178  ./dataflow/HDFSLookupAdapterFactory.java
    -0 +76  ./dataflow/HDFSObjectTupleParser.java
    -0 +65  ./dataflow/HDFSObjectTupleParserFactory.java
    -0 +420  ./dataflow/HiveObjectParser.java
    -0 +51  ./dataflow/IAsterixHDFSRecordParser.java
    -0 +40  ./dataflow/IControlledTupleParser.java
    -0 +19  ./dataflow/IControlledTupleParserFactory.java
    -0 +347  ./dataflow/IndexingScheduler.java
    -0 +199  ./dataflow/RCFileControlledTupleParser.java
  1. … 3781 more files in changeset.