[ASTERIXDB-2422][STO] Introduce compressed storage

- user model changes: yes

- Add new configuration in the with-clause to enable compression

- Add new nc configuration in the config file

- storage format changes: yes

- Pages of the primary index can be compressed

- Add a companion file (Look Aside File) with the compressed index

- Allow optional values in the LocalResource

- Add compression information in Metadata.Dataset

- interface changes: yes

- ICCApplicationContext:

- Add getCompressionManager()

- IBufferCache:

- Add getCompressedFileWriter(int fileId)

- ICachedPageInternal:

- Add setCompressedPageOffset(long offset)

- Add getCompressedPageOffset()

- Add setCompressedPageSize(int size)

- Add getCompressedPageSize()
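
The companion-file idea above can be illustrated with a small sketch. It is not AsterixDB's on-disk format: the use of zlib, the 4096-byte page size, and the function names are all assumptions made for illustration. It only shows why variable-sized compressed pages need a per-page (offset, size) lookup, loosely mirroring the offset and size accessors listed above.

```python
import zlib

PAGE_SIZE = 4096  # hypothetical fixed size of an uncompressed page


def write_compressed(pages):
    """Compress each page and record where it landed.

    Returns (data, look_aside), where look_aside[i] is the
    (offset, size) of compressed page i inside data.
    """
    data = bytearray()
    look_aside = []
    for page in pages:
        compressed = zlib.compress(page)
        look_aside.append((len(data), len(compressed)))
        data += compressed
    return bytes(data), look_aside


def read_page(data, look_aside, page_id):
    """Locate a page through the look-aside entries and decompress it."""
    offset, size = look_aside[page_id]
    return zlib.decompress(data[offset:offset + size])


if __name__ == "__main__":
    pages = [bytes([i]) * PAGE_SIZE for i in range(3)]
    data, look_aside = write_compressed(pages)
    assert read_page(data, look_aside, 2) == pages[2]
    print(look_aside)
```

Because compressed pages no longer sit at multiples of the page size, a reader cannot compute a page's position from its id alone, which is the role the look-aside entries play in this sketch.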

Details:

- Add new integration test for this patch

- Fix ASTERIXDB-2464

- Add ddl-with-clause type validator

Additional details in the design document:

https://cwiki.apache.org/confluence/display/ASTERIXDB/Compression+in+AsterixDB

Change-Id: Idde6f37c810c30c7f1a5ee8bcbc1e3e5f4410031

Reviewed-on: https://asterix-gerrit.ics.uci.edu/2857

Sonar-Qube: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Murtadha Hubail <mhubail@apache.org>

    • -0
    • +66
    ./all_datasets_compressed/all_datasets_compressed.1.ddl.sqlpp
    • -0
    • +20
    ./all_datasets_compressed/all_datasets_compressed.10.post.http
    • -0
    • +20
    ./all_datasets_compressed/all_datasets_compressed.11.get.http
    • -0
    • +20
    ./all_datasets_compressed/all_datasets_compressed.12.get.http
    • -0
    • +35
    ./all_datasets_compressed/all_datasets_compressed.13.query.sqlpp
    • -0
    • +22
    ./all_datasets_compressed/all_datasets_compressed.14.query.sqlpp
    • -0
    • +22
    ./all_datasets_compressed/all_datasets_compressed.15.query.sqlpp
    • -0
    • +22
    ./all_datasets_compressed/all_datasets_compressed.16.query.sqlpp
    • -0
    • +24
    ./all_datasets_compressed/all_datasets_compressed.2.update.sqlpp
    • -0
    • +35
    ./all_datasets_compressed/all_datasets_compressed.3.query.sqlpp
    • -0
    • +20
    ./all_datasets_compressed/all_datasets_compressed.4.post.http
    • -0
    • +20
    ./all_datasets_compressed/all_datasets_compressed.5.get.http
    • -0
    • +20
    ./all_datasets_compressed/all_datasets_compressed.6.get.http
    • -0
    • +35
    ./all_datasets_compressed/all_datasets_compressed.7.query.sqlpp
    • -0
    • +22
    ./all_datasets_compressed/all_datasets_compressed.8.query.sqlpp
  1. … 177 more files in changeset.
[ASTERIXDB-2286][COMP][FUN][HYR] Parallel Sort Optimization

- user model changes: yes

- storage format changes: no

- interface changes: yes

details:

- new plan for sort operation which includes sampling and

replicating the stream of data to be sorted. Sort-merge connector

is removed from the plan. The sorted result now is in multiple partitions.

- new optimization rule to check whether full parallel sort is applicable.

- new Forward operator to read the replicated sort input stream and

to receive the output of the sampling.

- new sequential merge connector to merge a globally ordered result residing

in multiple partitions (in addition to the connector's partition computer).

- "asterix-lang-aql/pom.xml" is changed as a result of refactoring

code related to the range map handling.

- new private sampling function to generate the range map object

(local & global functions) & their type computers.
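
The plan described above (sample, build a range map, range-partition, sort each partition, then read the partitions in order) can be sketched in a few lines. This is a simplified single-process illustration, not the Hyracks implementation: the function names, the sample size, and the way split points are picked are assumptions.

```python
import bisect
import random


def build_range_map(data, num_partitions, sample_size, seed=0):
    """Pick num_partitions - 1 split points from a sorted random sample."""
    rng = random.Random(seed)
    sample = sorted(rng.sample(data, min(sample_size, len(data))))
    if not sample:
        return []
    return [sample[(i * len(sample)) // num_partitions]
            for i in range(1, num_partitions)]


def partition_of(value, range_map):
    """Index of the partition whose range contains value."""
    return bisect.bisect_right(range_map, value)


def parallel_sort(data, num_partitions=4, sample_size=32):
    """Range-partition the input, then sort every partition independently."""
    range_map = build_range_map(data, num_partitions, sample_size)
    partitions = [[] for _ in range(num_partitions)]
    for value in data:
        partitions[partition_of(value, range_map)].append(value)
    for part in partitions:
        part.sort()  # in a real cluster each partition sorts in parallel
    return partitions


def sequential_merge(partitions):
    """Partitions are already globally ordered, so reading them in order suffices."""
    result = []
    for part in partitions:
        result.extend(part)
    return result


if __name__ == "__main__":
    rng = random.Random(1)
    data = [rng.randrange(1000) for _ in range(200)]
    assert sequential_merge(parallel_sort(data)) == sorted(data)
```

Since every value in partition i is no larger than any value in partition i + 1, the final step is a concatenation rather than a comparison-based merge, which is why the sort-merge connector is no longer needed in this kind of plan.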

user model changes:

- new compiler property is added to enable and disable parallel sort.

interface changes:

- "ILogicalOperatorVisitor.java" includes Forward Operator.

- "ITuplePartitionComputer.java" includes initialize() to enable the partitioner

to do some initialization. FieldRangePartitionComputerFactory uses it to

pick a range map.

- "ITuplePartitionComputerFactory.java": createPartitioner() is changed to

createPartitioner(IHyracksTaskContext hyracksTaskContext). Context is needed

for transferring the range map through the context.

Change-Id: I73e128029a46f45e6b68c23dfb9310d5de10582f

Reviewed-on: https://asterix-gerrit.ics.uci.edu/2393

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Dmitry Lychagin <dmitry.lychagin@couchbase.com>

    • -0
    • +2
    ./single_dataset_with_index/single_dataset_with_index.13.query.sqlpp
    • -0
    • +2
    ./single_dataset_with_index/single_dataset_with_index.8.query.sqlpp
  1. … 356 more files in changeset.
[ASTERIXDB-2170][SQL] Fix resolution order of implicit field access

- user model changes: yes

- storage format changes: no

- interface changes: no

Details:

- Improved name resolution rules

- Resolve field access to the nearest variable in scope

instead of raising compile-time error

- Do not rely on type information when resolving names

- Cleanup group variable handling in GroupBy clause,

no longer use ‘with’ map for it

- Fix ByNameToByIndexFieldAccessRule to use type environment

of its input operator when analyzing its expression

- Fix ExternalGroupByPOperator to use input schema of its

aggregate function when generating runtime for that function

- Fix invalid free variable computation for GroupBy clause
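
As a rough illustration of the "nearest variable in scope" rule above, the following sketch resolves a bare identifier against a stack of scopes without consulting any type information. The scope representation, the tuple results, and the tie-breaking choice (the most recently bound variable of the innermost non-empty scope) are assumptions made for this example, not the compiler's actual data structures or exact rules.

```python
def resolve(name, scopes):
    """Resolve an identifier against a stack of scopes (outermost first).

    Each scope is a list of variable names in binding order.
    """
    # 1. A declared variable wins, searching from the innermost scope outward.
    for scope in reversed(scopes):
        if name in scope:
            return ("var", name)
    # 2. Otherwise treat the name as a field of the nearest variable in scope.
    for scope in reversed(scopes):
        if scope:
            return ("field", scope[-1], name)
    # 3. With no variable in scope at all there is nothing to attach the field to.
    raise NameError(f"cannot resolve {name!r}: no variable in scope")


if __name__ == "__main__":
    scopes = [["c"], ["o"]]           # outer binds c, inner binds o
    print(resolve("o", scopes))       # a variable reference
    print(resolve("total", scopes))   # a field access on the nearest variable
```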

Change-Id: I50bc823ff53da06507a5454b30f4f500b862d4bf

Reviewed-on: https://asterix-gerrit.ics.uci.edu/2207

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Dmitry Lychagin <dmitry.lychagin@couchbase.com>

    • -7
    • +7
    ./all_datasets/all_datasets.13.query.sqlpp
    • -7
    • +7
    ./all_datasets/all_datasets.3.query.sqlpp
    • -7
    • +7
    ./all_datasets/all_datasets.7.query.sqlpp
    • -7
    • +7
    ./single_dataverse/single_dataverse.13.query.sqlpp
    • -7
    • +7
    ./single_dataverse/single_dataverse.3.query.sqlpp
    • -7
    • +7
    ./single_dataverse/single_dataverse.7.query.sqlpp
  1. … 309 more files in changeset.
[ASTERIXDB-2050][SQL] Enforce a Semicolon After Each SQL++ Statement

- user model changes: a semicolon must be added after

every SQL++ statement.

- storage format changes: no

- interface changes: no

Details:

- Enforce a semicolon after each SQL++ statement.

- Adapt existing SQL++ test cases to new model.

Change-Id: I27e9e8fde5ff867ab569c8d443ba1522738046e3

Reviewed-on: https://asterix-gerrit.ics.uci.edu/1954

Sonar-Qube: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Till Westmann <tillw@apache.org>

    • -2
    • +2
    ./all_datasets/all_datasets.1.ddl.sqlpp
    • -1
    • +1
    ./duplicate_location/duplicate_location.1.ddl.sqlpp
    • -1
    • +1
    ./empty_location/empty_location.1.ddl.sqlpp
    • -1
    • +1
    ./identical_location/identical_location.1.ddl.sqlpp
    • -1
    • +1
    ./miss_dataverse/miss_dataverse.1.ddl.sqlpp
    • -1
    • +1
    ./single_dataset/single_dataset.1.ddl.sqlpp
    • -1
    • +1
    ./single_dataset_with_index/single_dataset_with_index.1.ddl.sqlpp
    • -2
    • +2
    ./single_dataverse/single_dataverse.1.ddl.sqlpp
  1. … 2070 more files in changeset.
Remove default node group.

In this way, CREATE DATASET statement can adjust to dynamic

cluster topology.

When we create a dataset:

- if the node group name is not given, we create a new node group

using all currently available nodes;

- if the node group name is given, we use the given node group for

the dataset.

When we drop a dataset:

- if no other dataset depends on the node group of the dataset to

be dropped, we also drop the node group.
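
The create and drop rules above can be modeled with a small in-memory sketch. The class, the method names, and the generated node group name are hypothetical and exist only to make the rules concrete; they do not reflect the actual metadata code.

```python
class Cluster:
    """Toy model of datasets and node groups on a changing set of nodes."""

    def __init__(self, nodes):
        self.nodes = set(nodes)
        self.node_groups = {}  # node group name -> frozenset of node names
        self.datasets = {}     # dataset name -> node group name

    def create_dataset(self, name, node_group=None):
        if node_group is None:
            # No group given: create one from all currently available nodes.
            node_group = f"{name}_group"  # hypothetical naming scheme
            self.node_groups[node_group] = frozenset(self.nodes)
        elif node_group not in self.node_groups:
            raise KeyError(f"unknown node group: {node_group}")
        self.datasets[name] = node_group

    def drop_dataset(self, name):
        group = self.datasets.pop(name)
        # Drop the node group too if no other dataset still depends on it.
        if group not in self.datasets.values():
            del self.node_groups[group]


if __name__ == "__main__":
    cluster = Cluster(["nc1", "nc2"])
    cluster.create_dataset("A")
    cluster.nodes.add("nc3")  # topology changes after A was created
    cluster.create_dataset("B")
    print(sorted(cluster.node_groups["A_group"]))
    print(sorted(cluster.node_groups["B_group"]))
```

In this model, a dataset created after a node joins picks up the larger topology automatically, while earlier datasets keep the group they were created with.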

Change-Id: If68dc6a7c1270ab1f5049c9334e3318425fd8287

Reviewed-on: https://asterix-gerrit.ics.uci.edu/1799

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

BAD: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Michael Blow <mblow@apache.org>

    • -1
    • +1
    ./duplicate_location/duplicate_location.5.query.sqlpp
  1. … 51 more files in changeset.
Support rebalancing all datasets or a given dataverse.

Change-Id: Iad2740fd53b36bf122fd469beeca759d887e40fb

Reviewed-on: https://asterix-gerrit.ics.uci.edu/1793

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

BAD: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: abdullah alamoudi <bamousaa@gmail.com>

    • -0
    • +62
    ./all_datasets/all_datasets.1.ddl.sqlpp
    • -0
    • +20
    ./all_datasets/all_datasets.10.post.http
    • -0
    • +20
    ./all_datasets/all_datasets.11.get.http
    • -0
    • +20
    ./all_datasets/all_datasets.12.get.http
    • -0
    • +35
    ./all_datasets/all_datasets.13.query.sqlpp
    • -0
    • +22
    ./all_datasets/all_datasets.14.query.sqlpp
    • -0
    • +22
    ./all_datasets/all_datasets.15.query.sqlpp
    • -0
    • +22
    ./all_datasets/all_datasets.16.query.sqlpp
    • -0
    • +24
    ./all_datasets/all_datasets.2.update.sqlpp
    • -0
    • +35
    ./all_datasets/all_datasets.3.query.sqlpp
    • -0
    • +20
    ./all_datasets/all_datasets.4.post.http
    • -0
    • +20
    ./all_datasets/all_datasets.5.get.http
    • -0
    • +20
    ./all_datasets/all_datasets.6.get.http
    • -0
    • +35
    ./all_datasets/all_datasets.7.query.sqlpp
    • -0
    • +22
    ./all_datasets/all_datasets.8.query.sqlpp
  1. … 53 more files in changeset.
Support rebalancing datasets with indexes.

- Remove type arguments from the methods in IndexUtils

that generate index operation (e.g., create, load, compact)

jobs. Do type extraction inside SecondaryIndexOperationsHelper.

Change-Id: I9c0720382440ae44441a8f8847e75649a3822fa2

Reviewed-on: https://asterix-gerrit.ics.uci.edu/1790

Sonar-Qube: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

BAD: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Dmitry Lychagin <dmitry.lychagin@couchbase.com>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Till Westmann <tillw@apache.org>

    • -0
    • +49
    ./single_dataset_with_index/single_dataset_with_index.1.ddl.sqlpp
    • -0
    • +20
    ./single_dataset_with_index/single_dataset_with_index.10.get.http
    • -0
    • +25
    ./single_dataset_with_index/single_dataset_with_index.11.query.sqlpp
    • -0
    • +22
    ./single_dataset_with_index/single_dataset_with_index.12.query.sqlpp
    • -0
    • +25
    ./single_dataset_with_index/single_dataset_with_index.13.query.sqlpp
    • -0
    • +23
    ./single_dataset_with_index/single_dataset_with_index.2.update.sqlpp
    • -0
    • +25
    ./single_dataset_with_index/single_dataset_with_index.3.query.sqlpp
    • -0
    • +20
    ./single_dataset_with_index/single_dataset_with_index.4.post.http
    • -0
    • +20
    ./single_dataset_with_index/single_dataset_with_index.5.get.http
    • -0
    • +25
    ./single_dataset_with_index/single_dataset_with_index.6.query.sqlpp
    • -0
    • +22
    ./single_dataset_with_index/single_dataset_with_index.7.query.sqlpp
    • -0
    • +25
    ./single_dataset_with_index/single_dataset_with_index.8.query.sqlpp
    • -0
    • +20
    ./single_dataset_with_index/single_dataset_with_index.9.post.http
  1. … 23 more files in changeset.
Add a dataset rebalance REST API.

- Failures during rebalance are not handled;

- Indexes are not built for the rebalance target.

Change-Id: Ibda35252031fc4940972f0f19bbf796cadfa53d6

Reviewed-on: https://asterix-gerrit.ics.uci.edu/1768

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

BAD: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Michael Blow <mblow@apache.org>

    • -0
    • +47
    ./duplicate_location/duplicate_location.1.ddl.sqlpp
    • -0
    • +23
    ./duplicate_location/duplicate_location.2.update.sqlpp
    • -0
    • +20
    ./duplicate_location/duplicate_location.3.post.http
    • -0
    • +22
    ./duplicate_location/duplicate_location.4.query.sqlpp
    • -0
    • +22
    ./duplicate_location/duplicate_location.5.query.sqlpp
    • -0
    • +47
    ./empty_location/empty_location.1.ddl.sqlpp
    • -0
    • +23
    ./empty_location/empty_location.2.update.sqlpp
    • -0
    • +20
    ./empty_location/empty_location.3.post.http
    • -0
    • +22
    ./empty_location/empty_location.4.query.sqlpp
    • -0
    • +47
    ./identical_location/identical_location.1.ddl.sqlpp
    • -0
    • +23
    ./identical_location/identical_location.2.update.sqlpp
    • -0
    • +20
    ./identical_location/identical_location.3.post.http
    • -0
    • +22
    ./identical_location/identical_location.4.query.sqlpp
    • -0
    • +20
    ./nonexist_dataset/nonexist_dataset.1.post.http
    • -0
    • +47
    ./single_dataset/single_dataset.1.ddl.sqlpp
  1. … 61 more files in changeset.