[ASTERIXDB-2540][STO] Optimize Storage Disk I/O

- user model changes: yes. Add a new storage option:

storage.disk.force.bytes (default 16MB),

- storage format changes: no.

- interface changes: yes.

Introduced IPageWriteCallback to LSM indexes

Details:

- Bypass all queuing (from BufferCache and IOManager) for disk writes.

This queuing is unnecessary and destroys fairness among multiple

writers.

- Introduce IPageWriteCallback to control the behavior of disk page

writes. Currently, this interface is used to perform disk forces

regularly for each writer thread.

Change-Id: I1f618dc7c186623e860239b4d97640fe3528e75b

Reviewed-on: https://asterix-gerrit.ics.uci.edu/3285

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Michael Blow <mblow@apache.org>

  1. … 141 more files in changeset.
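The periodic disk force described in ASTERIXDB-2540 can be sketched roughly as follows. This is only an illustration, not the AsterixDB code: PageWriteCallbackSketch, PeriodicForceCallback and ForceSketch are hypothetical names, and the only details taken from the commit message are the per-writer callback idea and the byte threshold (storage.disk.force.bytes, default 16MB). The sketch relies on the standard java.nio FileChannel.force call.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical stand-in for a per-writer page write callback.
interface PageWriteCallbackSketch {
    void afterWrite(FileChannel channel, int bytesWritten) throws IOException;
}

// Forces the channel once at least forceBytes have been written since the last force.
class PeriodicForceCallback implements PageWriteCallbackSketch {
    private final long forceBytes;
    private long pendingBytes;
    private int forceCount;

    PeriodicForceCallback(long forceBytes) {
        this.forceBytes = forceBytes;
    }

    @Override
    public void afterWrite(FileChannel channel, int bytesWritten) throws IOException {
        pendingBytes += bytesWritten;
        if (pendingBytes >= forceBytes) {
            channel.force(false); // flush file content, not metadata
            pendingBytes = 0;
            forceCount++;
        }
    }

    int forceCount() {
        return forceCount;
    }
}

class ForceSketch {
    // Writes `pages` pages of `pageSize` bytes to a temp file and returns how many forces happened.
    static int writePages(int pages, int pageSize, long forceBytes) {
        PeriodicForceCallback callback = new PeriodicForceCallback(forceBytes);
        try {
            Path file = Files.createTempFile("force-sketch", ".bin");
            try (FileChannel channel = FileChannel.open(file, StandardOpenOption.WRITE)) {
                for (int i = 0; i < pages; i++) {
                    ByteBuffer page = ByteBuffer.allocate(pageSize);
                    int written = 0;
                    while (page.hasRemaining()) {
                        written += channel.write(page);
                    }
                    callback.afterWrite(channel, written);
                }
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return callback.forceCount();
    }
}
```

With the default from the commit message, a writer would be given new PeriodicForceCallback(16L * 1024 * 1024). Because each writer owns its own callback, the amount of unforced data is bounded per writer thread.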
[ASTERIXDB-1130][STO]: JSON serialization for persisted objects

- user model changes: no

- storage format changes:

This change replaces the use of Java serialization for persisted objects

such as dataset/index metadata, checkpoints, etc.

This will break backward compatibility with any existing AsterixDB instance.

However, the change is needed to enable future backward compatibility support

for persisted objects.

- interface changes:

IJsonSerializable: contains API to serialize a class as a JsonNode.

IPersistedResourceRegistry: contains a mapping between an IJsonSerializable

class and a unique type id. An IPersistedResourceRegistry is responsible

for generating the class identifier in the JSON output.

The class identifier will always contain the following attributes:

@type: a unique type id that identifies the object type.

@version: the version of the serialized class.

@class: the serialized class full name.

Any class registered with PersistedResourceRegistry must provide

a static fromJson(IPersistedResourceRegistry, JsonNode) method for

deserialization. This is ensured during the class registration process.

Change-Id: I5b103e06eab6627dbfe9d531caae1a3ac4b296da

Reviewed-on: https://asterix-gerrit.ics.uci.edu/2752

Reviewed-by: Murtadha Hubail <mhubail@apache.org>

Integration-Tests: Murtadha Hubail <mhubail@apache.org>

Tested-by: Murtadha Hubail <mhubail@apache.org>

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: abdullah alamoudi <bamousaa@gmail.com>

  1. … 128 more files in changeset.
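The registration and class-identifier scheme described in ASTERIXDB-1130 can be sketched as below. This is a simplified illustration under stated assumptions: a plain Map stands in for a Jackson JsonNode so the sketch needs no external library, and JsonSerializableSketch, ResourceRegistrySketch and CheckpointSketch are hypothetical names. Only the @type, @version and @class attributes and the static fromJson requirement come from the commit message.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical simplification: a Map stands in for a JSON object node.
interface JsonSerializableSketch {
    Map<String, Object> toJson(ResourceRegistrySketch registry);
}

class ResourceRegistrySketch {
    private final Map<String, Class<?>> byTypeId = new HashMap<>();
    private final Map<Class<?>, String> byClass = new HashMap<>();

    // Registration fails fast if the class lacks a static fromJson(registry, json) method.
    void register(String typeId, Class<? extends JsonSerializableSketch> clazz) {
        try {
            Method m = clazz.getDeclaredMethod("fromJson", ResourceRegistrySketch.class, Map.class);
            if (!Modifier.isStatic(m.getModifiers())) {
                throw new IllegalArgumentException(clazz.getName() + ".fromJson must be static");
            }
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(clazz.getName() + " has no fromJson method", e);
        }
        byTypeId.put(typeId, clazz);
        byClass.put(clazz, typeId);
    }

    // Builds the class identifier that every serialized object carries.
    Map<String, Object> classIdentifier(Class<?> clazz, long version) {
        String typeId = byClass.get(clazz);
        if (typeId == null) {
            throw new IllegalStateException("unregistered class " + clazz.getName());
        }
        Map<String, Object> json = new LinkedHashMap<>();
        json.put("@type", typeId);
        json.put("@version", version);
        json.put("@class", clazz.getName());
        return json;
    }

    // Looks up the class by @type and invokes its static fromJson reflectively.
    Object deserialize(Map<String, Object> json) {
        Class<?> clazz = byTypeId.get((String) json.get("@type"));
        if (clazz == null) {
            throw new IllegalStateException("unknown type id " + json.get("@type"));
        }
        try {
            Method m = clazz.getDeclaredMethod("fromJson", ResourceRegistrySketch.class, Map.class);
            m.setAccessible(true);
            return m.invoke(null, this, json);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}

// Example persisted object with a single field.
class CheckpointSketch implements JsonSerializableSketch {
    final long lsn;

    CheckpointSketch(long lsn) {
        this.lsn = lsn;
    }

    @Override
    public Map<String, Object> toJson(ResourceRegistrySketch registry) {
        Map<String, Object> json = registry.classIdentifier(getClass(), 1L);
        json.put("lsn", lsn);
        return json;
    }

    static CheckpointSketch fromJson(ResourceRegistrySketch registry, Map<String, Object> json) {
        return new CheckpointSketch((Long) json.get("lsn"));
    }
}
```

Resolving the class through @type rather than @class is what would let a class be renamed or moved later without invalidating data already on disk, which is presumably the future compatibility the message refers to.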
[NO ISSUE][STO] Add consistency to flush lifecycle

- user model changes: no

- storage format changes: yes

- renamed AbstractLSMIOOperationCallbackFactory

to LSMIOOperationCallbackFactory

- useless classes have been removed.

- LSMBTreeIOOperationCallbackFactory

- LSMBTreeWithBuddyIOOperationCallbackFactory

- LSMInvertedIndexIOOperationCallbackFactory

- LSMRTreeIOOperationCallbackFactory

- interface changes: yes

Details:

- Previously, flushes had different lifecycles depending

on the memory component state

- not allocated

- allocated

- modified

- In certain cases, flush operations are skipped altogether

- IO Operation callbacks became complicated and difficult

to maintain since calls are done differently in different

cases.

- In certain cases, afterFinalize is called on the IO

Operation callbacks even if beforeOperation was never

called.

- In this change, flushes go through the same lifecycle

events regardless of the state of the memory component.

- In addition, primary and secondary memory components

would reside in different virtual buffer caches due

to skipped flushes, or due to having the secondary

index created when the primary index's memory component

is residing on the virtual buffer cache with index !=0.

- Moreover, when flushes are lagging and all memory

components are being flushed, search operations assume

the oldest memory component is the newest and

produce incorrect results.

- In addition, in case of a failed flush of a component,

the IO scheduler would skip it and flush the next

component. This would produce a bad state on disk.

- In this change, a failed flush can be retried. Otherwise,

all future flushes of the component fail due to the failure

of the previously failed flush.

- Previously, when a component fails to modify an index due

to flush failures, it assumes the disk is full.

- With this change, the modification failure reports the

original cause of the failed flush.

Change-Id: I29f7992ec6c0f71c5b63d45800b2fb590d651e4b

Reviewed-on: https://asterix-gerrit.ics.uci.edu/2584

Reviewed-by: Murtadha Hubail <mhubail@apache.org>

Tested-by: Murtadha Hubail <mhubail@apache.org>

    • -1
    • +1
    ./LSMIndexCompactOperatorNodePushable.java
  1. … 161 more files in changeset.
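The uniform flush lifecycle described in this changeset can be illustrated with a small sketch. It is a guess at the shape of the idea, not the actual implementation: FlushLifecycleSketch, MemoryComponentStateSketch and the writeComponent step are hypothetical, and only the three component states and the beforeOperation / afterFinalize callback names come from the commit message.

```java
import java.util.ArrayList;
import java.util.List;

// The three memory component states listed in the commit message.
enum MemoryComponentStateSketch { NOT_ALLOCATED, ALLOCATED, MODIFIED }

class FlushLifecycleSketch {
    // Records the lifecycle events in the order they fire.
    final List<String> events = new ArrayList<>();

    // Every flush emits the same callback pair; only the write step depends on the state
    // (assumption: an unmodified component has nothing to write).
    void flush(MemoryComponentStateSketch state) {
        events.add("beforeOperation");
        try {
            if (state == MemoryComponentStateSketch.MODIFIED) {
                events.add("writeComponent");
            }
        } finally {
            // afterFinalize is only ever reached after beforeOperation has run.
            events.add("afterFinalize");
        }
    }
}
```

Since the callbacks fire in the same order for every state, a callback implementation no longer needs to special-case flushes of unallocated or unmodified components.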
[NO ISSUE] Incremental cleanup of deprecated exception ctors

Change-Id: I1e7c3655828fc6530cef83ea502a6cfbf41acddf

Reviewed-on: https://asterix-gerrit.ics.uci.edu/2533

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Murtadha Hubail <mhubail@apache.org>

    • -1
    • +1
    ./LSMIndexInsertUpdateDeleteOperatorNodePushable.java
  1. … 240 more files in changeset.
[NO ISSUE][STO] Adapt Storage Structure To Rebalance

- user model changes: no

- storage format changes: no

- interface changes: yes

-- Added IResource#setPath to use for the resource

storage migration.

Details:

- Unify storage structure to support dataset rebalance:

Old format:

./storage/partition_#/dataverse/datasetName_idx_indexName

New format:

./storage/partition_#/dataverse/datasetName/rebalanaceNum/indexName

- Adapt recovery and replication to new storage structure.

- Add old structure -> new structure NC migration task.

- Add CompatibilityUtil to ensure NC can be upgraded during

NC startup.

- Centralize the logic for parsing file path to its components in

ResourceReference/DatasetResourceReference.

- Add storage structure migration test case.

- Add test case for recovery after rebalance.

Change-Id: I0f968b9f493bf5aa2d49f503afe21f0d438bb7f0

Reviewed-on: https://asterix-gerrit.ics.uci.edu/2181

Sonar-Qube: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: abdullah alamoudi <bamousaa@gmail.com>

  1. … 40 more files in changeset.
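The old-to-new path mapping described in this changeset can be sketched as a small helper. This is an illustration only: StoragePathSketch and toNewFormat are hypothetical names, the rebalance number is passed in by the caller rather than assumed, and the sketch splits on the first "_idx_" occurrence, so a dataset name that itself contains "_idx_" would be split incorrectly.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class StoragePathSketch {
    private static final String OLD_SEPARATOR = "_idx_";

    // Converts .../dataverse/datasetName_idx_indexName
    // into     .../dataverse/datasetName/<rebalanceNum>/indexName
    static Path toNewFormat(Path oldPath, int rebalanceNum) {
        String leaf = oldPath.getFileName().toString();
        int split = leaf.indexOf(OLD_SEPARATOR);
        if (split < 0) {
            throw new IllegalArgumentException("not an old-format index path: " + oldPath);
        }
        String dataset = leaf.substring(0, split);
        String index = leaf.substring(split + OLD_SEPARATOR.length());
        return oldPath.resolveSibling(dataset)
                .resolve(String.valueOf(rebalanceNum))
                .resolve(index);
    }

    public static void main(String[] args) {
        Path oldPath = Paths.get("storage", "partition_0", "Dv", "Users_idx_byName");
        System.out.println(toNewFormat(oldPath, 0));
    }
}
```

Keeping the parsing in one place mirrors the stated goal of centralizing path handling, so that recovery, replication and migration agree on how a path breaks down into its components.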
[NO ISSUE][RT][IDX] Simplify index.createAccessor()

- user model changes: no

- storage format changes: no

- interface change: yes

(changed) IIndex, ILSMIndex

(new) IIndexAccessParameters

details:

- Refactor index.createAccessor() method to accept

an instance of IIndexAccessParameters as its parameter

since previously only ModificationCallBack and

SearchOperationCallback could be passed. If an accessor

needed additional parameters, there was no way

to pass them.

Change-Id: Iae015c342e830c81d666428447b595280139740e

Reviewed-on: https://asterix-gerrit.ics.uci.edu/2120

Sonar-Qube: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: abdullah alamoudi <bamousaa@gmail.com>

    • -3
    • +2
    ./LSMIndexCompactOperatorNodePushable.java
  1. … 74 more files in changeset.
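The refactoring described in this changeset is an instance of the parameter-object pattern, which can be sketched as below. All names here (AccessParametersSketch, AccessorSketch, IndexSketch, the "hint" option) are hypothetical; the real IIndexAccessParameters interface may expose different methods.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical parameter object bundling the two callbacks plus open-ended extras.
class AccessParametersSketch {
    private final Runnable modificationCallback;
    private final Runnable searchOperationCallback;
    private final Map<String, Object> extras = new HashMap<>();

    AccessParametersSketch(Runnable modificationCallback, Runnable searchOperationCallback) {
        this.modificationCallback = modificationCallback;
        this.searchOperationCallback = searchOperationCallback;
    }

    Runnable getModificationCallback() {
        return modificationCallback;
    }

    Runnable getSearchOperationCallback() {
        return searchOperationCallback;
    }

    // New options can be added here without touching the createAccessor signature.
    AccessParametersSketch with(String key, Object value) {
        extras.put(key, value);
        return this;
    }

    Map<String, Object> getExtras() {
        return Collections.unmodifiableMap(extras);
    }
}

// Minimal accessor that remembers one optional setting.
class AccessorSketch {
    final Object hint;

    AccessorSketch(Object hint) {
        this.hint = hint;
    }
}

class IndexSketch {
    // Single-argument factory: the accessor pulls whatever it needs from the parameter object.
    AccessorSketch createAccessor(AccessParametersSketch iap) {
        return new AccessorSketch(iap.getExtras().getOrDefault("hint", "none"));
    }
}
```

A caller that needs nothing beyond the two callbacks passes new AccessParametersSketch(modCb, searchCb), while one that needs more chains .with(...) calls. Either way the method signature stays stable as options are added.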
[NO ISSUE][STO] Make compact use the index io operation callback

- user model changes: no

- storage format changes: no

- interface changes: no

Change-Id: Ie06f480d729448de99dcce1912b72449940e4ea3

Reviewed-on: https://asterix-gerrit.ics.uci.edu/2102

Sonar-Qube: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Michael Blow <mblow@apache.org>

    • -7
    • +1
    ./LSMIndexCompactOperatorNodePushable.java
Change DataflowHelperFactory not to require Task Context

Change-Id: I9dcd95dbefca131c4bbdb43306f00f6f8ea60800

Reviewed-on: https://asterix-gerrit.ics.uci.edu/1758

Reviewed-by: Yingyi Bu <buyingyi@gmail.com>

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

BAD: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

    • -1
    • +1
    ./LSMIndexCompactOperatorNodePushable.java
  1. … 19 more files in changeset.
Separate index build from index access

This change separates index build from index access.

All indexes now have a single dataflow helper which

uses the index path to locate the resource on the nc

to read the resource from memory or disk.

Existing resource metadata and dataflow helpers were

combined into resource builders eliminating lots of

duplicated code.

Change-Id: Ie4ea3aaa63dff8d246fa43ca7c7359729bc8cf47

Reviewed-on: https://asterix-gerrit.ics.uci.edu/1728

Integration-Tests: Ian Maxon <imaxon@apache.org>

Tested-by: Ian Maxon <imaxon@apache.org>

Reviewed-by: Yingyi Bu <buyingyi@gmail.com>

    • -85
    • +0
    ./AbstractLSMIndexDataflowHelper.java
    • -66
    • +0
    ./AbstractLSMIndexDataflowHelperFactory.java
    • -8
    • +8
    ./LSMIndexCompactOperatorNodePushable.java
    • -6
    • +10
    ./LSMIndexInsertUpdateDeleteOperatorNodePushable.java
    • -22
    • +7
    ./LSMTreeIndexCompactOperatorDescriptor.java
    • -25
    • +17
    ./LSMTreeIndexInsertUpdateDeleteOperatorDescriptor.java
    • -0
    • +101
    ./LsmResource.java
    • -0
    • +74
    ./LsmResourceFactory.java
  1. … 564 more files in changeset.
Carry filter in 2ndary-to-primary index search

Change-Id: I287f1dbd230aa649f1350114abf0a1d47e2bb53c

Reviewed-on: https://asterix-gerrit.ics.uci.edu/1720

Sonar-Qube: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

BAD: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Luo Chen <cluo8@uci.edu>

Reviewed-by: Yingyi Bu <buyingyi@gmail.com>

  1. … 56 more files in changeset.
Introduce IStorageComponentProvider

Change-Id: If86750cdb2436c713f6598e54d4aaaf23d9f7bbf

Reviewed-on: https://asterix-gerrit.ics.uci.edu/1451

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Yingyi Bu <buyingyi@gmail.com>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

    • -3
    • +3
    ./AbstractLSMIndexDataflowHelperFactory.java
    • -2
    • +2
    ./LSMTreeIndexCompactOperatorDescriptor.java
    • -2
    • +2
    ./LSMTreeIndexInsertUpdateDeleteOperatorDescriptor.java
  1. … 424 more files in changeset.
Remove Append Only Flag

Change-Id: Id5d6917db8ab29aa01521596f556006e25a502fe

Reviewed-on: https://asterix-gerrit.ics.uci.edu/1385

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

BAD: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Murtadha Hubail <hubailmor@gmail.com>

    • -4
    • +6
    ./LSMTreeIndexCompactOperatorDescriptor.java
    • -2
    • +3
    ./LSMTreeIndexInsertUpdateDeleteOperatorDescriptor.java
  1. … 212 more files in changeset.
Cleanup FileSplit and FileReference

This change gives FileSplit and FileReference specific meaning to

avoid confusion between absolute vs. relative, local vs. global, and inside

an IO device vs. outside IO devices.

In addition, it enables better abstraction of global partitions and

delegates the responsibility of choosing which partition goes to which

IO device to the IO Manager through the introduction of FileDeviceComputer.

In detail:

Previously, the LocalResource in Hyracks had a partition (storage partition),

but there is no such concept in Hyracks. This scope leak is bad. In addition,

the local resource had a name and a path. They were always the same, so

the name was removed.

The storage partition was instead moved to the AsterixDB implementation of the

serialized object in the local resource.

With all of these changes, the cluster controller (compiler) only needs to

know about partitions and relative paths. It doesn't need to worry about

heterogeneous Node setups and different io device configurations. For File

assignment to IO devices, a new interface (IFileDeviceComputer) was

introduced which can be overridden by applications to have their own

strategy for distributing files among IO devices.

Change-Id: I4fac508bf9af5a3bed41a3cf4464d2cbfecf2f61

Reviewed-on: https://asterix-gerrit.ics.uci.edu/1352

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: abdullah alamoudi <bamousaa@gmail.com>

    • -2
    • +3
    ./LSMIndexCompactOperatorNodePushable.java
    • -1
    • +2
    ./LSMIndexInsertUpdateDeleteOperatorNodePushable.java
    • -1
    • +2
    ./LSMTreeIndexCompactOperatorDescriptor.java
  1. … 284 more files in changeset.
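The pluggable file-to-device assignment described in this changeset can be sketched as a strategy interface. The method name and signature below are assumptions made for illustration; the commit message only states that IFileDeviceComputer exists and can be overridden by applications.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical shape of the strategy: pick an IO device for a relative path.
interface FileDeviceComputerSketch {
    String computeDevice(String relativePath, List<String> devices);
}

// One possible strategy: spread files over devices by hashing the relative path.
class HashingDeviceComputer implements FileDeviceComputerSketch {
    @Override
    public String computeDevice(String relativePath, List<String> devices) {
        // floorMod keeps the index non-negative even for negative hash codes.
        int idx = Math.floorMod(relativePath.hashCode(), devices.size());
        return devices.get(idx);
    }
}

class DeviceComputerDemo {
    public static void main(String[] args) {
        List<String> devices = Arrays.asList("/mnt/d0", "/mnt/d1");
        FileDeviceComputerSketch computer = new HashingDeviceComputer();
        System.out.println(computer.computeDevice("storage/partition_0/Dv/Users", devices));
    }
}
```

Because the strategy receives only a relative path, the cluster controller never needs to know how many devices a node has; each node resolves the path locally against its own device list.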
ASTERIXDB-1228: Add MISSING into the data model.

1. MISSING represents the value of a non-existing field in a record

or an out-of-bound index access of a collection;

2. NULL represents that the value of an optional field in a record

is unknown or the value of existing collection entry is unknown.

3. Unit tests for all missing/null-in-missing/null-out scalar functions.

Change-Id: Ia49ed8474bfc5d6604231819065117468c5b0897

Reviewed-on: https://asterix-gerrit.ics.uci.edu/846

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Till Westmann <tillw@apache.org>

    • -3
    • +3
    ./LSMTreeIndexInsertUpdateDeleteOperatorDescriptor.java
  1. … 551 more files in changeset.
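The MISSING/NULL distinction introduced in ASTERIXDB-1228 can be illustrated with a small sketch. It uses plain Java maps and lists as stand-ins for records and collections; ValueSketch and its sentinel objects are hypothetical and unrelated to the actual AsterixDB type system.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ValueSketch {
    // Sentinels distinguishing "no such field/position" from "value is unknown".
    static final Object MISSING = new Object() {
        @Override
        public String toString() {
            return "MISSING";
        }
    };
    static final Object NULL = new Object() {
        @Override
        public String toString() {
            return "NULL";
        }
    };

    private ValueSketch() {
    }

    // Field access: an absent field yields MISSING, a present-but-unknown field yields NULL.
    static Object field(Map<String, Object> record, String name) {
        if (!record.containsKey(name)) {
            return MISSING;
        }
        Object v = record.get(name);
        return v == null ? NULL : v;
    }

    // Index access: an out-of-bound position yields MISSING, an unknown entry yields NULL.
    static Object item(List<?> collection, int index) {
        if (index < 0 || index >= collection.size()) {
            return MISSING;
        }
        Object v = collection.get(index);
        return v == null ? NULL : v;
    }

    public static void main(String[] args) {
        Map<String, Object> record = new HashMap<>();
        record.put("name", "Ann");
        record.put("age", null);
        System.out.println(field(record, "name"));
        System.out.println(field(record, "age"));
        System.out.println(field(record, "email"));
        System.out.println(item(Arrays.asList(1, null), 5));
    }
}
```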
Deadlock-free locking protocol is enabled

- Added EntityCommitProfiler class in TransactionSubsystem.java file:

This profiler takes a report interval (in seconds) parameter and

reports entity level commit count every report interval (in seconds)

only if IS_PROFILE_MODE is set to true. The profiler runs in a separate

thread. However, the profiler thread doesn't start reporting the count

until the entityCommitCount > 0. The profiler can be used to measure

1) IPS (Inserts Per Second) and

2) IIPS (instantaneous IPS) for every report interval.

Change-Id: Ie58ae2f519baa53599e99b51bd61ea5f8366dafd

Reviewed-on: https://asterix-gerrit.ics.uci.edu/825

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Murtadha Hubail <hubailmor@gmail.com>

    • -2
    • +5
    ./LSMIndexInsertUpdateDeleteOperatorNodePushable.java
  1. … 63 more files in changeset.
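The two rates reported by the profiler can be sketched as follows. The commit message does not define them precisely, so this sketch assumes that IPS is the cumulative commit count divided by the time elapsed since reporting started, and that IIPS is the count gained during the last interval divided by the interval length. CommitRateSketch is a hypothetical name, and the separate profiler thread is left out.

```java
class CommitRateSketch {
    private final int intervalSeconds;
    private long elapsedSeconds;
    private long lastCount;
    private boolean started;

    CommitRateSketch(int intervalSeconds) {
        this.intervalSeconds = intervalSeconds;
    }

    // Called once per report interval with the cumulative entity commit count.
    // Returns {IPS, IIPS}, or null while no commit has been observed yet.
    double[] report(long cumulativeCount) {
        if (!started) {
            if (cumulativeCount <= 0) {
                return null; // reporting starts only once entityCommitCount > 0
            }
            started = true;
        }
        elapsedSeconds += intervalSeconds;
        double ips = cumulativeCount / (double) elapsedSeconds;
        double iips = (cumulativeCount - lastCount) / (double) intervalSeconds;
        lastCount = cumulativeCount;
        return new double[] { ips, iips };
    }
}
```

With a 5 second interval, cumulative counts of 100 and then 150 would give an IPS of 20 and then 15, while the IIPS drops from 20 to 10, which shows how the instantaneous figure reacts faster to a slowdown than the running average.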
Move Hyracks to subfolder

    • -0
    • +77
    ./AbstractLSMIndexDataflowHelper.java
    • -0
    • +66
    ./AbstractLSMIndexDataflowHelperFactory.java
    • -0
    • +72
    ./LSMIndexCompactOperatorNodePushable.java
    • -0
    • +125
    ./LSMIndexInsertUpdateDeleteOperatorNodePushable.java
    • -0
    • +58
    ./LSMTreeIndexCompactOperatorDescriptor.java
    • -0
    • +71
    ./LSMTreeIndexInsertUpdateDeleteOperatorDescriptor.java
  1. … 4422 more files in changeset.