pregelix

Added LSM component-level filters for all indexes.

Change-Id: I898cf885c9f88feae85c99799a00fd8ec036efea

Reviewed-on: http://fulliautomatix.ics.uci.edu:8443/81

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Yingyi Bu <buyingyi@gmail.com>

  1. … 129 more files in changeset.
1. make the limit operator work in a nested plan

2. avoid send-side materialization and use receive-side materialization instead, to reduce CC/NC communications

Change-Id: I1a90c70be0514fcc0b286afa456664618d68910f

Reviewed-on: http://fulliautomatix.ics.uci.edu:8443/80

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Pouria Pirzadeh <pouria.pirzadeh@gmail.com>

  1. … 1 more file in changeset.
Support big vertices in Pregelix.

-- Vertices larger than the page size are stored on HDFS as immutable files.

-- Updates to those big vertices trigger the creation of new immutable files.

Change-Id: I6b6f0528b6b5360c96dcdace1fa360d42c517f22

Reviewed-on: http://fulliautomatix.ics.uci.edu:8443/72

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Pouria Pirzadeh <pouria.pirzadeh@gmail.com>

    • -0
    • +23
    ./pregelix-example/data/skew/data.txt
  1. … 3 more files in changeset.
1. fix asterixdb issue 782

-- push nested pipeline before a nested group-by operator into the combiner group-by operator in the AbstractIntroduceGroupByCombinerRule

-- add a processNullTest abstract method in the AbstractIntroduceGroupByCombinerRule

-- fix the join order in a subplan

2. allow user-configurable buffer cache page size (B-tree page size) in Pregelix

commit 4d9a11d0c05281a41bbabe03066478fe851b3a2b

Author: buyingyi <buyingyi@gmail.com>

Change-Id: Ib7761370df8606c55ac34c126554319586e824f0

Reviewed-on: http://fulliautomatix.ics.uci.edu:8443/64

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Till Westmann <westmann@gmail.com>

  1. … 4 more files in changeset.
Several major changes in hyracks:

1. reduced CC/NC communications for reporting partition requests and availability; partition request/availability is now reported only for send-side materialized (non-pipelined) policies, in case of task re-attempt

2. changed the buffer cache to allocate memory dynamically as needed instead of pre-allocating

3. changed each network channel to allocate memory lazily as needed, and changed materialized connectors to allocate files lazily as needed

4. changed several major CCNCCFunctions to use non-java serde

5. added a sort-based group-by operator which pushes group-by aggregations into an external sort

6. made external sort a stable sort

1, 3, and 4 reduce the job overhead.

2 reduces unnecessary NC resource consumption such as memory and files.

5 and 6 are improvements to runtime operators.

One change in algebricks:

-- implemented a rule to push group-by aggregation into sort, i.e., using the sort-based gby operator

Several important changes in pregelix:

-- remove static states in vertex

-- direct check halt bit without deserialization

-- optimize the sort algorithm by packing yet-another 2-byte normalized key into the tPointers array
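
The "direct check halt bit without deserialization" item above can be illustrated with a minimal sketch. The record layout used here (one flags byte, then an 8-byte id and an 8-byte value) and all names are assumptions made for illustration; this log does not show the actual Pregelix vertex format. The point is only that a flag stored at a known offset can be tested on the raw bytes, without building a vertex object.

```java
import java.nio.ByteBuffer;

// Hypothetical layout: [1-byte flags][8-byte vertex id][8-byte value].
// This is NOT the real Pregelix record format, only an illustration.
final class HaltBitSketch {
    static final int FLAGS_OFFSET = 0;   // assumed position of the flags byte
    static final byte HALT_MASK = 0x01;  // assumed bit used for "halted"

    // Serialize a toy vertex into a byte array using the assumed layout.
    static byte[] serialize(boolean halted, long id, double value) {
        ByteBuffer buf = ByteBuffer.allocate(1 + 8 + 8);
        buf.put((byte) (halted ? HALT_MASK : 0));
        buf.putLong(id);
        buf.putDouble(value);
        return buf.array();
    }

    // Test the halt bit directly on the bytes; id and value are never decoded.
    static boolean isHalted(byte[] record, int start) {
        return (record[start + FLAGS_OFFSET] & HALT_MASK) != 0;
    }

    public static void main(String[] args) {
        byte[] active = serialize(false, 42L, 0.5);
        byte[] halted = serialize(true, 43L, 0.25);
        System.out.println(isHalted(active, 0)); // false
        System.out.println(isHalted(halted, 0)); // true
    }
}
```

Skipping deserialization this way avoids allocating an object for every vertex that only needs to be checked and passed over.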

Change-Id: Id696f9a9f1647b4a025b8b33d20b3a89127c60d6

Reviewed-on: http://fulliautomatix.ics.uci.edu:8443/35

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Till Westmann <westmann@gmail.com>

  1. … 262 more files in changeset.
fixed issue 731, 740, and more

commit 8911cc529e72e2bb544d9b472d6e10f173d173af

Author: Young-Seok <kisskys@gmail.com>

Date: Sun May 18 11:28:28 2014 -0700

another fix for picking available index for leftouterjoin plan

commit 9bce43087615fee53613467a027833dd53e190f9

Merge: c8e85ac efab69f

Author: Young-Seok <kisskys@gmail.com>

Date: Sun May 11 22:22:10 2014 -0700

merged master to kisskys/left-outer-join-issue branch

commit c8e85aca31545c13b2a02ff6dc259943e2cf66ad

Author: Young-Seok <kisskys@gmail.com>

Date: Sun May 11 20:17:17 2014 -0700

changes for left-outer-join to pick available indexes

Change-Id: Ib0fc186bc9388802f95445edee92c428b3bb69cc

Reviewed-on: http://fulliautomatix.ics.uci.edu:8443/34

Reviewed-by: Inci Cetindil <icetindil@gmail.com>

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

  1. … 51 more files in changeset.
[maven-release-plugin] prepare for next development iteration

  1. … 79 more files in changeset.
[maven-release-plugin] prepare release fullstack-0.2.11

  1. … 79 more files in changeset.
fix for issue #732

  1. … 11 more files in changeset.
changes to fix issue 727

  1. … 11 more files in changeset.
Removed stale config; removed jvm.extraargs from surefire config

  1. … 1 more file in changeset.
Consolidate surefire config into top-level pom

    • -20
    • +0
    ./pregelix-dataflow-std-base/pom.xml
  1. … 27 more files in changeset.
[maven-release-plugin] prepare for next development iteration

  1. … 79 more files in changeset.
[maven-release-plugin] prepare release fullstack-0.2.10

  1. … 79 more files in changeset.
Added Maven Central repository explicitly

  1. … 3 more files in changeset.
fix conflict in comment

fix to support non-default pregelix cc http ports

compatibility for bash versions <4.0

add (optional) CC_HTTPPORT and JOB_HISTORY_SIZE to conf

fix for linux setting

fix client dyn-opt setting

1. make startcc/nc scripts flexible for different physical memory size; 2. add dynamic optimization option in the Client

support heterogeneous cluster

  1. … 23 more files in changeset.
use Counters as partial value to simplify HadoopCountersAggregator

add new example for Counters usage

add support for Hadoop Counters via job.setCounterAggregatorClass

PregelixJob.setCounterAggregatorClass sets up a (user-specified) global aggregator and an iterationComplete hook to save Counter values.

The user-specified Counter-based aggregator (must extend HadoopCountersAggregator) is saved to HDFS in each iteration and should be restart/snapshot-aware.

To set up counters, call job.setCounterAggregatorClass. After job completion, the Counters may be retrieved from HDFS using BspUtils.getCounters(job).

Note that there is currently only one spot for iterationComplete hooks and this behavior occupies it.
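
As a rough illustration of why using the counters themselves as the partial value simplifies aggregation, here is a minimal sketch. It uses a plain `Map<String, Long>` as a stand-in for Hadoop Counters so that it runs without Hadoop; the class, method, and counter names are assumptions and are not part of Pregelix or HadoopCountersAggregator.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for Hadoop Counters: a plain name -> count map (an assumption
// made so the sketch is runnable without a Hadoop dependency).
final class CounterMergeSketch {
    // A partial value contributed by one vertex: a single named increment.
    static Map<String, Long> partial(String name, long delta) {
        Map<String, Long> m = new HashMap<>();
        m.put(name, delta);
        return m;
    }

    // Merge one partial value into the running aggregate, summing per name.
    // Partial and final values share a type, so one merge step covers both.
    static void merge(Map<String, Long> into, Map<String, Long> from) {
        for (Map.Entry<String, Long> e : from.entrySet()) {
            into.merge(e.getKey(), e.getValue(), Long::sum);
        }
    }

    public static void main(String[] args) {
        Map<String, Long> total = new HashMap<>();
        merge(total, partial("edgesVisited", 3));
        merge(total, partial("edgesVisited", 4));
        merge(total, partial("messagesSent", 1));
        System.out.println(total.get("edgesVisited")); // 7
        System.out.println(total.get("messagesSent")); // 1
    }
}
```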

add an "iteration complete" hook for aggregation/reporting across iterations

This commit allows the user to specify a class which will be called upon completion of each pregelix iteration. This allows us to perform a user-specified action between iterations.

As an example, a PerIterationGlobalAggregatesHook is provided which saves the complete set of global aggregator states from every iteration, allowing the user to observe aggregates from all iterations.

The default hook does nothing.

The hook instance is attached directly to the PregelixJob so that it can be retrieved by the Driver's caller.
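
The hook pattern described above can be sketched in a few lines. The interface name, method signature, and toy driver below are hypothetical and chosen for illustration; they are not the Pregelix API. The sketch shows a no-op default, a recording hook in the spirit of PerIterationGlobalAggregatesHook, and a caller that keeps a reference to the hook so it can read the results afterwards.

```java
import java.util.ArrayList;
import java.util.List;

final class IterationHookSketch {
    // Hypothetical hook interface; the real Pregelix signature may differ.
    interface IterationCompleteHook {
        void completeIteration(int superstep, long aggregate);
    }

    // Default behavior: do nothing between iterations.
    static final class NoOpHook implements IterationCompleteHook {
        public void completeIteration(int superstep, long aggregate) { }
    }

    // Records the aggregate state seen at the end of every iteration.
    static final class RecordingHook implements IterationCompleteHook {
        final List<Long> history = new ArrayList<>();

        public void completeIteration(int superstep, long aggregate) {
            history.add(aggregate);
        }
    }

    // Toy driver: runs the supersteps and invokes the hook after each one.
    static void run(int supersteps, IterationCompleteHook hook) {
        long aggregate = 0;
        for (int i = 1; i <= supersteps; i++) {
            aggregate += i; // stand-in for the global aggregator state
            hook.completeIteration(i, aggregate);
        }
    }

    public static void main(String[] args) {
        RecordingHook hook = new RecordingHook();
        run(3, hook);
        // The caller still holds the hook, so it can inspect every iteration.
        System.out.println(hook.history); // [1, 3, 6]
    }
}
```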

fix NPE when no custom aggregator is set

api for specifying update state for activate() and voteToHalt()

allow global aggregators to be specified in xml

Explicitly setting the aggregator in the PregelixJob constructors would override any values read in from the conf's resources.

This commit therefore doesn't set the conf explicitly; it specifies an array of aggregator class names which will always be in place when `getGlobalAggregatorClasses` is called.
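
One way to picture this approach: the getter combines a fixed array of always-present class names with whatever the configuration supplies, so nothing read from the conf is overwritten. The sketch below uses `java.util.Properties` as a stand-in for the job configuration; the property key, class names, and merge order are assumptions for illustration, not the actual Pregelix implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

final class AggregatorConfigSketch {
    // Hypothetical property key and default class name.
    static final String KEY = "sketch.aggregator.classes";
    static final String[] ALWAYS_PRESENT = { "example.DefaultAggregator" };

    // Returns the always-present names followed by any configured names.
    // The conf itself is never modified, so values loaded from XML survive.
    static List<String> getGlobalAggregatorClasses(Properties conf) {
        List<String> names = new ArrayList<>();
        for (String n : ALWAYS_PRESENT) {
            names.add(n);
        }
        String configured = conf.getProperty(KEY, "");
        for (String n : configured.split(",")) {
            String trimmed = n.trim();
            if (!trimmed.isEmpty() && !names.contains(trimmed)) {
                names.add(trimmed);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(KEY, "example.UserAggregator");
        System.out.println(getGlobalAggregatorClasses(conf));
        // [example.DefaultAggregator, example.UserAggregator]
    }
}
```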