asterixdb

Add Shutdown via API to Hyracks

This change adds a method to HyracksConnection called stopCluster().

When the CC receives this message, it asks all NC tasks to close

and acknowledge that they have received the message and are closing.

If all NCs have closed, or a 10 second timeout elapses, the CC then

exits with a 0 return code if all NCs closed, or a 1 if some did

not acknowledge the shutdown request.
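The acknowledge-or-timeout behaviour described above can be sketched with a countdown latch. This is only an illustration under assumptions: `NodeHandle`, `requestShutdown`, and `shutdownAll` are hypothetical names made up for the sketch and are not the Hyracks API; only the 0/1 exit-code rule and the timeout idea come from the commit message.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a node controller (assumption, not a Hyracks type).
interface NodeHandle {
    // Asks the node to close; the node runs `ack` once it has acknowledged.
    void requestShutdown(Runnable ack);
}

class ShutdownSketch {
    // Returns the exit code the coordinator would use:
    // 0 if every node acknowledged before the timeout, 1 otherwise.
    static int shutdownAll(List<NodeHandle> nodes, long timeout, TimeUnit unit) {
        CountDownLatch acks = new CountDownLatch(nodes.size());
        for (NodeHandle node : nodes) {
            node.requestShutdown(acks::countDown);
        }
        try {
            // await() returns true only if the count reached zero in time.
            return acks.await(timeout, unit) ? 0 : 1;
        } catch (InterruptedException e) {
            // Treat an interrupted wait as an incomplete shutdown.
            Thread.currentThread().interrupt();
            return 1;
        }
    }

    public static void main(String[] args) {
        NodeHandle responsive = ack -> ack.run(); // acknowledges immediately
        NodeHandle silent = ack -> { };           // never acknowledges
        System.out.println(shutdownAll(Arrays.asList(responsive, responsive), 10, TimeUnit.SECONDS));
        System.out.println(shutdownAll(Arrays.asList(responsive, silent), 100, TimeUnit.MILLISECONDS));
    }
}
```

The sketch assumes each node acknowledges at most once; a real implementation would also need to track which nodes failed to respond.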

Change-Id: Iaf3d395dc7964e114d4929830f40063f58e0d5da

Reviewed-on: http://fulliautomatix.ics.uci.edu:8443/76

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Vinayak Borkar <vinayakb@gmail.com>

  1. … 9 more files in changeset.
1. make limit operator work in nested plan

2. avoid send-side materialize but use receive-side materialize to reduce cc/nc communications.

Change-Id: I1a90c70be0514fcc0b286afa456664618d68910f

Reviewed-on: http://fulliautomatix.ics.uci.edu:8443/80

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Pouria Pirzadeh <pouria.pirzadeh@gmail.com>

Fix ADM Parser Bug

This change makes sure that each ADM parser instance uses its own objects for parsing

and does not share them with other parsers running on the same NC.
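As a rough illustration of the kind of fix described, the sketch below keeps the parser's scratch state in an instance field rather than a shared one. The class `RecordParserSketch` and its comma-splitting logic are hypothetical and stand in for the real ADM parser, whose internals are not shown in this log.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical parser; the class and field names are illustrative only.
class RecordParserSketch {
    // Per-instance scratch buffer: each parser owns its own, so two parsers
    // running on different threads never write into the same object.
    // Declaring this field `static` would share it across all instances.
    private final StringBuilder scratch = new StringBuilder();

    // Splits a comma-separated line into trimmed fields, reusing the scratch buffer.
    List<String> parse(String line) {
        List<String> fields = new ArrayList<>();
        scratch.setLength(0);
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == ',') {
                fields.add(scratch.toString().trim());
                scratch.setLength(0);
            } else {
                scratch.append(c);
            }
        }
        fields.add(scratch.toString().trim());
        return fields;
    }
}
```

A single instance is still not safe to call from several threads at once; the point is only that separate instances no longer interfere with each other.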

Change-Id: Ib54dd2f9f8474ddb8dc2d785f819dd62c7ce7ca3

Reviewed-on: http://fulliautomatix.ics.uci.edu:8443/73

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Till Westmann <westmann@gmail.com>

Support big vertex in Pregelix.

-- For those vertices beyond page size, we store them on HDFS as immutable files.

-- Updates on those big vertices will trigger creations of new immutable files.

Change-Id: I6b6f0528b6b5360c96dcdace1fa360d42c517f22

Reviewed-on: http://fulliautomatix.ics.uci.edu:8443/72

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Pouria Pirzadeh <pouria.pirzadeh@gmail.com>

    • -0
    • +23
    /pregelix/pregelix-example/data/skew/data.txt
  1. … 3 more files in changeset.
Add external indexes

This change includes the following:

1. an additional data parser for external data that parses HDFS records using Hive SerDes.

2. allow users to create external data. this includes:

a) changes in metadata external dataset details.

b) addition of a new metadata index to store external file's statuses.

c) the pipeline for building the B-Tree and R-Tree indexes.

d) hyracks operators to fetch records with their RIDs using different formats.

e) hyracks operators to lookup and parse external records.

f) test cases for indexing and index access of different hdfs file formats.

g) exposing the secondary indexes over external data to the compiler.

3. adding a new aql command to refresh external datasets. this includes

a) global recovery on system startup.

b) changes in the aql parser.

c) construction of bulk modify pipelines and additional operators to perform local commit and abort operations (using 2PC protocol).

4. Added copyright header to all new files

5. Added additional test cases to test left outer join on external data

Change-Id: I1065a473299f6027eb073aeeba3a56d137f6f98e

Reviewed-on: http://fulliautomatix.ics.uci.edu:8443/70

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Ian Maxon <imaxon@uci.edu>

    • binary
    /asterix-app/data/hdfs/external-indexing-test.rc
    • binary
    /asterix-app/data/hdfs/external-indexing-test.seq
    • -0
    • +11
    /asterix-app/data/hdfs/external-indexing-test.txt
    • -0
    • +21
    /asterix-app/data/hdfs/spatialData.json
    • -0
    • +250
    /asterix-app/data/hdfs/tw_for_indexleftouterjoin.adm
  1. … 136 more files in changeset.
Adding external indexes

On the Hyracks side, this change includes the following:

1. The addition of three indexes:

a) external b-tree index

b) external r-tree index

c) external b-tree with buddy b-tree index

2. creating an additional logical operator in algebricks for performing lookup operations over external data and modifying the different visitors to work with this operator

3. Added copyright header to all new files

Change-Id: Iecfbd86f06aff3caaf3a9652b63420666745ebb9

Reviewed-on: http://fulliautomatix.ics.uci.edu:8443/69

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Zachary Heilbron <zheilbron@gmail.com>

Reviewed-by: Sattam Alsubaiee <salsubaiee@gmail.com>

Reviewed-by: Ian Maxon <imaxon@uci.edu>

  1. … 51 more files in changeset.
1. Add an asterix-specific IntroduceGroupByCombinerRule to deal with null-test in the nested plan in a group-by operator

2. Add a regression test case for issue782, including optimizer test and runtime test

Change-Id: Ia678414451ebddb7367238fef9f22a6753aa6206

Reviewed-on: http://fulliautomatix.ics.uci.edu:8443/65

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Till Westmann <westmann@gmail.com>

    • -0
    • +11
    /asterix-app/data/tpch0.001/selectednation.tbl
1. fix asterixdb issue 782

-- push nested pipeline before a nested group-by operator into the combiner group-by operator in the AbstractIntroduceGroupByCombinerRule

-- add a processNullTest abstract method in the AbstractIntroduceGroupByCombinerRule

-- fix the join order in a subplan

2. allow user-configurable buffer cache page size (B-tree page size) in Pregelix

commit 4d9a11d0c05281a41bbabe03066478fe851b3a2b

Author: buyingyi <buyingyi@gmail.com>

Change-Id: Ib7761370df8606c55ac34c126554319586e824f0

Reviewed-on: http://fulliautomatix.ics.uci.edu:8443/64

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Till Westmann <westmann@gmail.com>

Hyracks issue #137 requires a new IUnnestPositionWriter to be defined. Here it is.

Updated AsterixDB to use the new IUnnestPositionWriter.

Change-Id: I9ad5dbaef7a3b347a61e0f8a5505d4db6dc232c3

Reviewed-on: http://fulliautomatix.ics.uci.edu:8443/68

Reviewed-by: Till Westmann <westmann@gmail.com>

Tested-by: Ian Maxon <imaxon@uci.edu>

Hyracks issue #137 shows a hard-coded value. The solution was to create an IUnnestPositionWriter to write the position variable in the form of a type defined outside Hyracks.

The update attempts to make as few changes as possible to add the writer.

Change-Id: I6ebab5b3dfabcd36c732067acefe33da22307fc7

Reviewed-on: http://fulliautomatix.ics.uci.edu:8443/67

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Till Westmann <westmann@gmail.com>

Reviewed-by: Preston Carman <ecarm002@ucr.edu>

Several major changes in hyracks:

-- reduced CC/NC communications for reporting partition request and availability; partition request/availability are only reported for the case of send-side materialized (without pipelining) policies in case of task re-attempt.

-- changed buffer cache to dynamically allocate memory based on needs instead of pre-allocating

-- changed each network channel to lazily allocate memory based on needs, and changed materialized connectors to lazily allocate files based on needs

-- changed several major CCNCCFunctions to use non-java serde

-- added a sort-based group-by operator which pushes group-by aggregations into an external sort

-- make external sort a stable sort

1, 3, and 4 are to reduce the job overhead.

2 is to reduce unnecessary NC resource consumption, such as memory and files.

5 and 6 are improvements to runtime operators.
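The "allocate based on needs instead of pre-allocating" idea mentioned above can be illustrated with a small sketch. `LazyBufferSketch` is a hypothetical class written for this example, not a Hyracks buffer cache or network channel type; it only shows memory being claimed on first use rather than at construction time.

```java
import java.nio.ByteBuffer;

// Hypothetical buffer holder (assumption); not thread-safe, for illustration only.
class LazyBufferSketch {
    private final int capacity;
    private ByteBuffer buffer; // stays null until the first write

    LazyBufferSketch(int capacity) {
        this.capacity = capacity; // remember the size, but allocate nothing yet
    }

    boolean isAllocated() {
        return buffer != null;
    }

    // Allocates the backing buffer on first use, then appends the data.
    void write(byte[] data) {
        if (buffer == null) {
            buffer = ByteBuffer.allocate(capacity);
        }
        buffer.put(data);
    }

    int bytesWritten() {
        return buffer == null ? 0 : buffer.position();
    }
}
```

A holder that is created but never written to costs no buffer memory, which is the saving the commit message attributes to the lazy-allocation changes.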

One change in algebricks:

-- implemented a rule to push group-by aggregation into sort, i.e., using the sort-based gby operator

Several important changes in pregelix:

-- remove static states in vertex

-- direct check halt bit without deserialization

-- optimize the sort algorithm by packing yet-another 2-byte normalized key into the tPointers array

Change-Id: Id696f9a9f1647b4a025b8b33d20b3a89127c60d6

Reviewed-on: http://fulliautomatix.ics.uci.edu:8443/35

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Till Westmann <westmann@gmail.com>

  1. … 262 more files in changeset.
reduced communication during result distribution

- when reporting the location of results, the NCs also report if the result partition is empty

- the client does not try to read empty partitions

better toString() for subclasses of AbstractWork
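The empty-partition optimization above can be sketched as a simple filter on the client side. The types `PartitionInfoSketch` and `ResultReaderSketch` are hypothetical and do not correspond to actual Hyracks classes; the sketch only assumes that each reported partition carries a location and an "empty" flag.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical record of what a node reports about one result partition.
class PartitionInfoSketch {
    final String location;
    final boolean empty;

    PartitionInfoSketch(String location, boolean empty) {
        this.location = location;
        this.empty = empty;
    }
}

class ResultReaderSketch {
    // Returns the locations the client would actually contact,
    // skipping partitions that were reported as empty.
    static List<String> partitionsToRead(List<PartitionInfoSketch> reported) {
        List<String> toRead = new ArrayList<>();
        for (PartitionInfoSketch p : reported) {
            if (!p.empty) {
                toRead.add(p.location);
            }
        }
        return toRead;
    }
}
```

Each skipped partition saves one client-to-node round trip, which is where the reduced communication would come from under these assumptions.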

Change-Id: Ia39f657e689ea305d49d55bd27c9a512e1ff970f

Reviewed-on: http://fulliautomatix.ics.uci.edu:8443/39

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Yingyi Bu <buyingyi@gmail.com>

  1. … 10 more files in changeset.
Support sort-based group-by, add test coverage for out-of-core code paths, and adapt to the new buffer cache interface.

-- add the support for sort-based group-by

-- add test coverage for disk-based code paths, including multi-pass code paths

-- populate framesize and group-by buffer size into asterix

-- adapt to new interface for buffer cache

Change-Id: I4af9eaa6fa6a8ae76b8ecaa39184785a90b32710

Reviewed-on: http://fulliautomatix.ics.uci.edu:8443/36

Tested-by: Ian Maxon <imaxon@uci.edu>

Reviewed-by: Till Westmann <westmann@gmail.com>

  1. … 25 more files in changeset.
start WorkQueue at the end of CC startup - should avoid issues like https://code.google.com/p/asterixdb/issues/detail?id=758

Change-Id: I209217ac1e923e7d2a22b6944c873236c62ef13b

Reviewed-on: http://fulliautomatix.ics.uci.edu:8443/38

Reviewed-by: Zachary Heilbron <zheilbron@gmail.com>

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Raman Grover <ramang@uci.edu>

fixed issue 731, 740, and more.

commit 0b46141bea8d503896dc06308f102131df2e4f3d

Author: Young-Seok <kisskys@gmail.com>

Date: Tue May 20 12:52:54 2014 -0700

fixed issues of access method rules that try to use incompatible indexes and similarity functions

commit a0ea4e411503de265f1883aa3837a45be4a8747a

Merge: bb8fe91 b5785a9

Author: Young-Seok <kisskys@gmail.com>

Date: Sun May 18 13:00:33 2014 -0700

merged from master branch to kisskys/left-outer-join-issue branch

commit bb8fe91ffd4fec3d495d32442020447693be8548

Author: Young-Seok <kisskys@gmail.com>

Date: Sun May 18 11:33:54 2014 -0700

another fix for picking available index for leftouterjoin plan

commit 60b057ecec6a157e3e11cb316ef7d38601483741

Merge: a743e44 6cb7fd9

Author: Young-Seok <kisskys@gmail.com>

Date: Sun May 11 22:22:42 2014 -0700

merged master to kisskys/left-outer-join-issue branch

commit a743e4493f0f84f7a71e671478592d487e7510e3

Author: Young-Seok <kisskys@gmail.com>

Date: Sun May 11 20:51:50 2014 -0700

changes for left-outer-join to pick available indexes

Change-Id: I0d89d20c6cc076f40d1fbc5687f0b70e49a91eed

Reviewed-on: http://fulliautomatix.ics.uci.edu:8443/33

Reviewed-by: Inci Cetindil <icetindil@gmail.com>

Tested-by: Ian Maxon <imaxon@uci.edu>

  1. … 59 more files in changeset.
fixed issue 731, 740, and more

commit 8911cc529e72e2bb544d9b472d6e10f173d173af

Author: Young-Seok <kisskys@gmail.com>

Date: Sun May 18 11:28:28 2014 -0700

another fix for picking available index for leftouterjoin plan

commit 9bce43087615fee53613467a027833dd53e190f9

Merge: c8e85ac efab69f

Author: Young-Seok <kisskys@gmail.com>

Date: Sun May 11 22:22:10 2014 -0700

merged master to kisskys/left-outer-join-issue branch

commit c8e85aca31545c13b2a02ff6dc259943e2cf66ad

Author: Young-Seok <kisskys@gmail.com>

Date: Sun May 11 20:17:17 2014 -0700

changes for left-outer-join to pick available indexes

Change-Id: Ib0fc186bc9388802f95445edee92c428b3bb69cc

Reviewed-on: http://fulliautomatix.ics.uci.edu:8443/34

Reviewed-by: Inci Cetindil <icetindil@gmail.com>

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

  1. … 42 more files in changeset.
Added recordType deep copying to avoid race condition
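Why a deep copy helps here can be shown with a minimal sketch. `RecordTypeSketch` is a hypothetical, much-simplified type descriptor and not the AsterixDB record type class; the assumption is that the race came from several users mutating one shared descriptor.

```java
// Hypothetical record type descriptor (assumption, for illustration only).
class RecordTypeSketch {
    private final String[] fieldNames;

    RecordTypeSketch(String[] fieldNames) {
        // Copy the array so the caller's array is never shared with this object.
        this.fieldNames = fieldNames.clone();
    }

    // Deep copy: the new object gets its own array. The String elements are
    // immutable, so sharing them between copies is safe.
    RecordTypeSketch deepCopy() {
        return new RecordTypeSketch(fieldNames);
    }

    void renameField(int index, String newName) {
        fieldNames[index] = newName;
    }

    String fieldName(int index) {
        return fieldNames[index];
    }
}
```

If every consumer works on its own `deepCopy()`, a change made by one of them cannot be observed halfway through by another, so no locking is needed around the descriptor.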

Change-Id: Ia06e3114ffa3b593eedab0b9537e5f2b14abb8be

Reviewed-on: http://fulliautomatix.ics.uci.edu:8443/37

Reviewed-by: Yingyi Bu <buyingyi@gmail.com>

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Fixed issues 698 and 699. Also added JSON printers for UUIDs

The following commits from your working branch will be included:

commit 956ec767c369f4b238ad98b260b8fe83b1b5ea40

Author: zheilbron <zheilbron@gmail.com>

Date: Fri May 16 10:50:15 2014 -0700

fix issue 698

commit 8e0fd4d8ea6779e8ced4ee8001ccff70c7ac97ab

Author: zheilbron <zheilbron@gmail.com>

Date: Fri May 16 10:36:22 2014 -0700

add JSON printers for UUIDs

commit 2b550b3646255c6c9dca95d16cb78d075ec22205

Author: zheilbron <zheilbron@gmail.com>

Date: Fri May 16 10:10:41 2014 -0700

fix issue 699

Change-Id: I096505bdb5d4ab0f0dbbc46d15349a2c5682fe29

Reviewed-on: http://fulliautomatix.ics.uci.edu:8443/32

Reviewed-by: Inci Cetindil <icetindil@gmail.com>

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

updated the test case

added a test case, addressed Raman's code review comment

    • -4
    • +1
    /asterix-algebra/src/main/javacc/AQLPlus.jj
Adding support for accessing an item in a list using a non-constant index

    • -14
    • +8
    /asterix-algebra/src/main/javacc/AQLPlus.jj
    • -14
    • +8
    /asterix-aql/src/main/javacc/AQL.jj
Basis for Cluster integration testing

This branch adds cluster testing via Vagrant.

Requires my branch of the vagrant-maven plugin to work,

which can be sourced here:

https://github.com/parshimers/vagrant-maven-plugin

It is enabled with -DclusterTest=true in mvn verify.

A virtualized cluster with 4 nodes is started, and

then Asterix is started via managix on this cluster,

and then stopped.

Change-Id: I7e3cdcd4162ada19ee1e15f532be7447b4f34367

Reviewed-on: http://fulliautomatix.ics.uci.edu:8443/31

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Ian Maxon <imaxon@uci.edu>

Reviewed-by: Zachary Heilbron <zheilbron@gmail.com>

Reviewed-by: Till Westmann <westmann@gmail.com>

Reviewed-by: Chris Hillery <ceej@lambda.nu>

    • -0
    • +52
    /asterix-installer/src/test/resources/clusterts/Vagrantfile
    • -0
    • +45
    /asterix-installer/src/test/resources/clusterts/cluster.xml
    • -0
    • +6
    /asterix-installer/src/test/resources/clusterts/hosts
    • -0
    • +27
    /asterix-installer/src/test/resources/clusterts/id_rsa
    • -0
    • +11
    /asterix-installer/src/test/resources/clusterts/known_hosts
addressed Young-Seok's code review comments

code review updates

Merge branch 'master' of https://code.google.com/p/asterixdb into icetindil/issue_765

fix for issue 765: stacking multiple index access methods

fixed failing optimizer test

optimization for disjunctive selection predicates - add rule to rewrite a disjunction of eq-comparisons on the same variable in a selection to a join
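The equivalence behind this rewrite can be sketched outside the optimizer. The code below is not the Algebricks rule; `DisjunctionRewriteSketch` and both method names are hypothetical, and it only shows that a selection with `x == c1 || x == c2 || ...` returns the same rows as a hash-based join of the input with the set of distinct constants.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class DisjunctionRewriteSketch {
    // Original form: keep x if it equals any of the constants.
    static List<Integer> selectWithDisjunction(List<Integer> xs, int... constants) {
        List<Integer> out = new ArrayList<>();
        for (int x : xs) {
            for (int c : constants) {
                if (x == c) {
                    out.add(x);
                    break; // a row matches at most once, as with OR
                }
            }
        }
        return out;
    }

    // Rewritten form: build a hash table over the distinct constants and probe it.
    // Using distinct constants keeps the result identical to the selection even
    // when the same constant appears twice in the disjunction.
    static List<Integer> joinWithConstants(List<Integer> xs, int... constants) {
        Set<Integer> build = new HashSet<>();
        for (int c : constants) {
            build.add(c);
        }
        List<Integer> out = new ArrayList<>();
        for (int x : xs) {
            if (build.contains(x)) {
                out.add(x);
            }
        }
        return out;
    }
}
```

Expressed as a join, the predicate can be evaluated with one hash probe per row instead of one comparison per constant, and a join form may also let the optimizer consider an index on the compared field.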

addressed code review comments

making edit-distance-contains() work with lists