Move Pregelix and Hivesterix codebase to new repositories: 1. Move Pregelix codebase to; 2. Move Hivesterix codebase to .

1. fix asterixdb issue 782

-- push nested pipeline before a nested group-by operator into the combiner group-by operator in the AbstractIntroduceGroupByCombinerRule

-- add a processNullTest abstract method in the AbstractIntroduceGroupByCombinerRule

-- fix the join order in a subplan

2. allow user-configurable buffer cache page size (B-tree page size) in Pregelix

Author: buyingyi <>

Several major changes in hyracks:

-- reduced CC/NC communications for reporting partition request and availability; partition request/availability are only reported for the case of send-side materialized (without pipelining) policies in case of task re-attempt.

-- changed buffer cache to dynamically allocate memory based on needs instead of pre-allocating

-- changed each network channel to lazily allocate memory based on needs, and changed materialized connectors to lazily allocate files based on needs

-- changed several major CCNCCFunctions to use non-java serde

-- added a sort-based group-by operator which pushes group-by aggregations into an external sort

-- make external sort a stable sort

1, 3, and 4 are to reduce the job overhead.

2 is to reduce unnecessary NC resource consumption, such as memory and files.

5 and 6 are improvements to runtime operators.
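
The on-demand allocation described for the buffer cache can be illustrated with a minimal sketch. This is not the Hyracks buffer cache code; the class name LazyPagePoolSketch and its methods are hypothetical, and the only point shown is that a page is allocated on first access rather than up front.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration only; not the Hyracks buffer cache API.
class LazyPagePoolSketch {
    private final ByteBuffer[] pages; // slots stay null until first use
    private final int pageSize;
    private int allocated;

    LazyPagePoolSketch(int pageSize, int maxPages) {
        this.pageSize = pageSize;
        this.pages = new ByteBuffer[maxPages]; // reserves slots, not page memory
    }

    // Returns the page for the given slot, allocating it on first access.
    ByteBuffer page(int slot) {
        if (pages[slot] == null) {
            pages[slot] = ByteBuffer.allocate(pageSize);
            allocated++;
        }
        return pages[slot];
    }

    // Number of pages that have actually been allocated so far.
    int allocatedPages() {
        return allocated;
    }

    public static void main(String[] args) {
        LazyPagePoolSketch pool = new LazyPagePoolSketch(4096, 1024);
        pool.page(3);
        pool.page(3);
        System.out.println("allocated pages: " + pool.allocatedPages());
    }
}
```

With pre-allocation, all maxPages buffers would exist as soon as the pool is constructed; here a pool that is never touched holds only the slot array.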

One change in algebricks:

-- implemented a rule to push group-by aggregation into sort, i.e., using the sort-based gby operator
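
A minimal sketch of the idea behind pushing a group-by aggregation into a sort, assuming (key, value) records held as long[2] arrays and SUM as the aggregate. The class and method names are hypothetical and this is not the Algebricks rule or the Hyracks operator; it only shows that each sorted run can be pre-aggregated and that runs can be combined again during the merge.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; records are {key, value} pairs, aggregate is SUM.
class SortBasedGroupBySketch {
    // Appends a record, folding it into the last output record when the key matches.
    static void append(List<long[]> out, long key, long value) {
        int last = out.size() - 1;
        if (last >= 0 && out.get(last)[0] == key) {
            out.get(last)[1] += value;
        } else {
            out.add(new long[] { key, value });
        }
    }

    // Builds one run: sort by key, then aggregate adjacent records with equal keys.
    static List<long[]> sortAndCombine(List<long[]> records) {
        List<long[]> sorted = new ArrayList<>(records);
        sorted.sort((long[] x, long[] y) -> Long.compare(x[0], y[0]));
        List<long[]> run = new ArrayList<>();
        for (long[] r : sorted) {
            append(run, r[0], r[1]);
        }
        return run;
    }

    // Two-way merge of sorted runs that keeps aggregating equal keys.
    static List<long[]> mergeRuns(List<long[]> a, List<long[]> b) {
        List<long[]> out = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < a.size() || j < b.size()) {
            boolean takeA = j >= b.size() || (i < a.size() && a.get(i)[0] <= b.get(j)[0]);
            long[] r = takeA ? a.get(i++) : b.get(j++);
            append(out, r[0], r[1]);
        }
        return out;
    }

    public static void main(String[] args) {
        List<long[]> run1 = sortAndCombine(List.of(new long[] { 3, 1 }, new long[] { 1, 2 }, new long[] { 3, 4 }));
        List<long[]> run2 = sortAndCombine(List.of(new long[] { 2, 1 }, new long[] { 1, 1 }));
        for (long[] r : mergeRuns(run1, run2)) {
            System.out.println(r[0] + " -> " + r[1]);
        }
    }
}
```

Because each run is aggregated before it is written, the runs that reach the merge phase contain at most one record per key.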

Several important changes in pregelix:

-- remove static states in vertex

-- check the halt bit directly without deserialization

-- optimize the sort algorithm by packing yet-another 2-byte normalized key into the tPointers array
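
The normalized-key optimization can be sketched as follows. The layout here is an assumption for illustration: a 2-byte prefix of each key is kept next to its record index, and the full key bytes are compared only when two prefixes tie. The actual tPointers layout in the sort code is not reproduced, and all names below are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration only; not the Pregelix/Hyracks sort implementation.
class NormalizedKeySortSketch {
    // Assumed 2-byte normalized key: the first two key bytes, zero-padded.
    static int normalizedKey(byte[] key) {
        int b0 = key.length > 0 ? (key[0] & 0xFF) : 0;
        int b1 = key.length > 1 ? (key[1] & 0xFF) : 0;
        return (b0 << 8) | b1;
    }

    // Full comparison: unsigned lexicographic order over the raw bytes.
    static int compareFull(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int c = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (c != 0) {
                return c;
            }
        }
        return a.length - b.length;
    }

    // Returns record indexes in sorted key order.
    static Integer[] sortedOrder(byte[][] keys) {
        int[] nk = new int[keys.length]; // normalized keys, computed once per record
        Integer[] order = new Integer[keys.length];
        for (int i = 0; i < keys.length; i++) {
            nk[i] = normalizedKey(keys[i]);
            order[i] = i;
        }
        Arrays.sort(order, (x, y) -> {
            int c = Integer.compare(nk[x], nk[y]); // cheap integer comparison first
            return c != 0 ? c : compareFull(keys[x], keys[y]); // touch key bytes only on a tie
        });
        return order;
    }

    static byte[][] toKeys(String... words) {
        byte[][] keys = new byte[words.length][];
        for (int i = 0; i < words.length; i++) {
            keys[i] = words[i].getBytes(StandardCharsets.US_ASCII);
        }
        return keys;
    }

    public static void main(String[] args) {
        Integer[] order = sortedOrder(toKeys("pear", "apple", "ap", "banana", "a"));
        System.out.println(Arrays.toString(order));
    }
}
```

The prefix order is consistent with the full byte order, so a non-zero prefix comparison can be trusted and the fallback is needed only for equal prefixes.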

1. add deployment retry 2. support plan switch

NodeControllers clean up appEntryPoints on shutdown (2nd try)

support multiple user-defined global aggregators

fix IIndexAccessor interface, add a boolean exclusiveMode parameter for the createSearchCursor method

fix file write race condition

fix in-place update

checkpoint: added support for running aggregation using the group-by runtime. The Aggregator interface is also updated to handle both accumulating and running aggregation.

fix fault-tolerance and error reporting to handle disk failures

support large-size global aggregate values

add job concatenation support

add runtime checks to improve pregelix debug-ability

Implemented k-buffering for LSM indexes. Also added a fix for issues 589 and 594.

merge from zheilbron/hyracks_msr

inherit Hadoop's VLongWritable format
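
For reference, a standalone sketch of the variable-length long encoding as used by Hadoop's WritableUtils.writeVLong, re-implemented here without the Hadoop dependency. The class name VLongSketch is hypothetical, and the byte layout is written from my understanding of the Hadoop format, so it should be checked against the Hadoop sources before being relied on.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical re-implementation for illustration; not the Pregelix or Hadoop class.
class VLongSketch {
    static byte[] encode(long value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (value >= -112 && value <= 127) {
            out.write((int) value); // small values fit in a single byte
            return out.toByteArray();
        }
        int len = -112; // marker base for non-negative values
        if (value < 0) {
            value = ~value; // store the one's complement of negative values
            len = -120; // marker base for negative values
        }
        for (long tmp = value; tmp != 0; tmp >>= 8) {
            len--; // one step per payload byte
        }
        out.write(len); // first byte encodes the sign and the payload length
        int n = (len < -120) ? -(len + 120) : -(len + 112);
        for (int idx = n; idx != 0; idx--) {
            int shift = (idx - 1) * 8;
            out.write((int) ((value >> shift) & 0xFF)); // big-endian payload
        }
        return out.toByteArray();
    }

    static long decode(byte[] bytes) {
        byte first = bytes[0];
        if (first >= -112) {
            return first; // single-byte value
        }
        boolean negative = first < -120;
        int n = negative ? -(first + 120) : -(first + 112);
        long v = 0;
        for (int i = 1; i <= n; i++) {
            v = (v << 8) | (bytes[i] & 0xFF);
        }
        return negative ? ~v : v;
    }

    public static void main(String[] args) {
        long[] samples = { 0, 127, 128, -112, -113, Long.MAX_VALUE, Long.MIN_VALUE };
        for (long s : samples) {
            byte[] enc = encode(s);
            System.out.println(s + " uses " + enc.length + " byte(s), decodes to " + decode(enc));
        }
    }
}
```

Small vertex ids and counters therefore take one byte, while the worst case is nine bytes (one marker byte plus eight payload bytes).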

a temp fix for size estimation


add partition early termination support

address Vinayak's comments on open/close of vertex

add message overflow support

add LSM support in pregelix

add support for customized partitioners

add/update license headers

reduce unnecessary B-Tree (non-inplace) update

add normalized key computer support in Pregelix

reintegrate fullstack_dynamic_deployment

Merged fullstack_lsm_staging up to r3336

reformat the code in r3269

