[NO ISSUE][STO] Add consistency to flush lifecycle

- user model changes: no

- storage format changes: yes

- renamed AbstractLSMIOOperationCallbackFactory

to LSMIOOperationCallbackFactory

- useless classes have been removed.

- LSMBTreeIOOperationCallbackFactory

- LSMBTreeWithBuddyIOOperationCallbackFactory

- LSMInvertedIndexIOOperationCallbackFactory

- LSMRTreeIOOperationCallbackFactory

- interface changes: yes

Details:

- Previously, flushes had different lifecycles depending

on the memory component state

- not allocated

- allocated

- modified

- In certain cases, flush operations are skipped altogether

- IO Operation callbacks became complicated and difficult

to maintain since calls are done differently in different

cases.

- In certain cases, afterFinalize is called on the IO

Operation callbacks even if beforeOperation was never

called.

- In this change, flushes go through the same lifecycle

events regardless of the state of the memory component.

- In addition, primary and secondary memory components

would reside in different virtual buffer caches due

to skipped flushes, or due to having the secondary

index created when the primary index's memory component

is residing on the virtual buffer cache with index !=0.

- Moreover, when flushes are lagging and all memory

components are being flushed, search operations assume

the oldest memory component is the newest and

produce incorrect results.

- In addition, in case of a failed flush of a component,

the IO scheduler would skip it and flush the next

component. This would produce a bad state on disk.

- In this change, a failed flush can be retried. Otherwise,

all future flushes of the component fail due to the failure

of the previously failed flush.

- Previously, when a component fails to modify an index due

to flush failures, it assumes disk is full.

- With this change, the modification failure reports the

original cause of the failed flush.

Change-Id: I29f7992ec6c0f71c5b63d45800b2fb590d651e4b

Reviewed-on: https://asterix-gerrit.ics.uci.edu/2584

Reviewed-by: Murtadha Hubail <mhubail@apache.org>

Tested-by: Murtadha Hubail <mhubail@apache.org>

    • -315
    • +0
    ./AbstractLSMIOOperationCallbackTest.java
    • -39
    • +0
    ./LSMBTreeIOOperationCallbackTest.java
    • -39
    • +0
    ./LSMBTreeWithBuddyIOOperationCallbackTest.java
    • -0
    • +217
    ./LSMIOOperationCallbackTest.java
    • -39
    • +0
    ./LSMInvertedIndexIOOperationCallbackTest.java
    • -39
    • +0
    ./LSMRTreeIOOperationCallbackTest.java
    • -0
    • +56
    ./TestFlushOperation.java
    • -0
    • +183
    ./TestLSMIndexAccessor.java
    • -12
    • +13
    ./TestLSMIndexOperationContext.java
  1. … 153 more files in changeset.
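
The unified lifecycle described in this commit can be illustrated with a small sketch. It is not the AsterixDB implementation: `FlushLifecycleSketch`, `MemoryComponentState`, and the recorded event strings are hypothetical names chosen for illustration. Only `beforeOperation` and `afterFinalize` are named in the commit message; `afterOperation`, and the choice to skip it on failure, are assumptions. The sketch shows the same callback events firing for every memory component state, and a failed flush keeping its original cause so that a later modification failure can report it.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; all names here are hypothetical stand-ins.
class FlushLifecycleSketch {
    enum MemoryComponentState { NOT_ALLOCATED, ALLOCATED, MODIFIED }

    private Exception flushFailure; // original cause of the last failed flush, if any

    // Every flush fires the same callback events, whatever the component state.
    List<String> flush(MemoryComponentState state, boolean failWrite) {
        List<String> events = new ArrayList<>();
        events.add("beforeOperation");
        try {
            if (state == MemoryComponentState.MODIFIED) {
                if (failWrite) {
                    throw new IllegalStateException("simulated I/O error");
                }
                events.add("writeDiskComponent"); // only a modified component has data to write
            }
            flushFailure = null; // a successful (re)try clears the recorded failure
            events.add("afterOperation");
        } catch (IllegalStateException e) {
            flushFailure = e; // remember the cause; the flush can be retried later
        } finally {
            events.add("afterFinalize"); // always paired with beforeOperation
        }
        return events;
    }

    // A modification blocked by a failed flush reports the original cause.
    void modify() {
        if (flushFailure != null) {
            throw new IllegalStateException("cannot modify index", flushFailure);
        }
    }

    public static void main(String[] args) {
        FlushLifecycleSketch index = new FlushLifecycleSketch();
        for (MemoryComponentState s : MemoryComponentState.values()) {
            System.out.println(s + " -> " + index.flush(s, false));
        }
    }
}
```

In this sketch a retry is simply another call to `flush` on the same component; once it succeeds, `modify` no longer throws.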
[ASTERIXDB-1952][TX][IDX] Filter logs pt.2

- user model changes: no

- storage format changes: yes

- interface changes: yes

Details:

- Add a log type specifically for filters

- Only log change when filter actually widens

- Stop logging of index + filter tuple during modification

- Redo index and filter tuples separately via their logs

Change-Id: Ie9e7795d9c8c212e8610dcb9bb5d26ec9fbbee8a

Reviewed-on: https://asterix-gerrit.ics.uci.edu/1857

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Ian Maxon <imaxon@apache.org>

    • -1
    • +25
    ./TestLSMIndexOperationContext.java
  1. … 46 more files in changeset.
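
The rule "only log change when filter actually widens" can be sketched as follows. This is a minimal illustration, assuming a filter that tracks a min/max range over long values; `RangeFilterSketch`, `update`, and the in-memory `log` list are hypothetical and are not the AsterixDB filter or log classes.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; assumes a min/max range filter over long values.
class RangeFilterSketch {
    private long min = Long.MAX_VALUE;
    private long max = Long.MIN_VALUE;
    final List<String> log = new ArrayList<>(); // stand-in for the transaction log

    // Applies a value to the filter and returns true only if the range widened.
    boolean update(long value) {
        boolean widened = false;
        if (value < min) {
            min = value;
            widened = true;
        }
        if (value > max) {
            max = value;
            widened = true;
        }
        if (widened) {
            log.add("FILTER[" + min + "," + max + "]"); // one filter log record per widening
        }
        return widened;
    }

    public static void main(String[] args) {
        RangeFilterSketch filter = new RangeFilterSketch();
        for (long v : new long[] { 10, 20, 15, 5 }) {
            filter.update(v);
        }
        System.out.println(filter.log);
    }
}
```

A value that falls inside the current range, such as 15 after 10 and 20, produces no filter log record.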
[NO ISSUE][STO] Improve the LSMIOOperationCallback interface.

- user model changes: no

- storage format changes: no

- interface changes: yes

+ ILSMIndexOperationContext.getIoOperationType()

+ ILSMIndexOperationContext.getNewComponent()

* before, after, and finalize

calls of ILSMIOOperationCallback now take

ILSMIndexOperationContext as a parameter

Details:

- Previously, some calls to ILSMIOOperationCallback

took just an enum LSMIOOperationType, while others

took an enum and a component object. These sometimes did not

provide enough information to different implementations of

the callback that might be interested in more than that.

- Having the operation context object passed allows for

better exchange of information between different callers

and callees throughout the IO operation.

Change-Id: Ib7120c40a1a2256ed528dfd2e5853db9dba247c6

Reviewed-on: https://asterix-gerrit.ics.uci.edu/2455

Sonar-Qube: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: abdullah alamoudi <bamousaa@gmail.com>

    • -44
    • +68
    ./AbstractLSMIOOperationCallbackTest.java
    • -0
    • +177
    ./TestLSMIndexOperationContext.java
  1. … 17 more files in changeset.
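
The interface change in this commit, passing the operation context to every callback call, can be sketched like this. The types below are simplified stand-ins and not the real Hyracks interfaces: `OperationContext` mirrors only the two getters named in the commit message, a `String` stands in for the component object, and `TracingCallback` and `run` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; simplified stand-ins, not the real Hyracks interfaces.
class CallbackContextSketch {
    enum IoOperationType { FLUSH, MERGE }

    // Stand-in for the operation context handed to every callback call.
    interface OperationContext {
        IoOperationType getIoOperationType();
        String getNewComponent(); // a String stands in for the component object
    }

    // Stand-in callback: each lifecycle call receives the whole context.
    interface IoCallback {
        void beforeOperation(OperationContext ctx);
        void afterOperation(OperationContext ctx);
        void afterFinalize(OperationContext ctx);
    }

    // Records what it can learn from the context at each step.
    static final class TracingCallback implements IoCallback {
        final List<String> trace = new ArrayList<>();

        @Override
        public void beforeOperation(OperationContext ctx) {
            trace.add("before " + ctx.getIoOperationType());
        }

        @Override
        public void afterOperation(OperationContext ctx) {
            trace.add("after " + ctx.getIoOperationType() + " " + ctx.getNewComponent());
        }

        @Override
        public void afterFinalize(OperationContext ctx) {
            trace.add("finalize " + ctx.getIoOperationType() + " " + ctx.getNewComponent());
        }
    }

    // Drives one IO operation through the three calls with a single shared context.
    static List<String> run(IoOperationType type, String newComponent) {
        OperationContext ctx = new OperationContext() {
            @Override
            public IoOperationType getIoOperationType() {
                return type;
            }

            @Override
            public String getNewComponent() {
                return newComponent;
            }
        };
        TracingCallback callback = new TracingCallback();
        callback.beforeOperation(ctx);
        callback.afterOperation(ctx);
        callback.afterFinalize(ctx);
        return callback.trace;
    }

    public static void main(String[] args) {
        System.out.println(run(IoOperationType.FLUSH, "component-1"));
    }
}
```

Because all three calls see the same context object, a callback implementation can read whatever it needs without the interface growing a new parameter for each case.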
[ASTERIXDB-2188] Ensure recovery of component ids

- user model changes: no

- storage format changes: yes.

Flush log record format changes.

- interface changes: no

Details:

- Add flush component ids to the flush log record. Upon

seeing a flush log record during recovery, schedule

a flush to all indexes in this partition s.t. LSN>maxDiskLSN

to ensure component ids are properly maintained upon

failed flushes.

- Add a test case to ensure the correctness of the recovery logic

of component ids

Change-Id: I8c1fc2b209cfb9d3dafa216771d2b7032eb99e75

Reviewed-on: https://asterix-gerrit.ics.uci.edu/2408

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: abdullah alamoudi <bamousaa@gmail.com>

    • -3
    • +5
    ./AbstractLSMIOOperationCallbackTest.java
  1. … 17 more files in changeset.
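
The recovery rule in this commit can be sketched as a simple selection step. This is one reading of "LSN>maxDiskLSN", taken here to mean that the flush log record's LSN is compared against each index's largest on-disk LSN; that interpretation, the class `RecoveryFlushSketch`, and the map-based representation of a partition's indexes are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; the data model is a hypothetical simplification.
class RecoveryFlushSketch {
    // Returns the indexes whose on-disk state is older than the flush log record.
    static List<String> indexesToFlush(Map<String, Long> maxDiskLsnByIndex, long flushLogLsn) {
        List<String> toFlush = new ArrayList<>();
        for (Map.Entry<String, Long> entry : maxDiskLsnByIndex.entrySet()) {
            if (flushLogLsn > entry.getValue()) {
                toFlush.add(entry.getKey()); // this index missed the flush and needs one scheduled
            }
        }
        return toFlush;
    }

    public static void main(String[] args) {
        Map<String, Long> partition = new LinkedHashMap<>();
        partition.put("primary", 100L);
        partition.put("secondary", 250L);
        System.out.println(indexesToFlush(partition, 200L));
    }
}
```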
[NO ISSUE][STO] Introduce Index Checkpoints

- user model changes: no

- storage format changes: yes

- Add index checkpoints.

- Use index checkpoint to determine low watermark

during recovery.

- interface changes: yes

- Introduce IIndexCheckpointManager for managing

index checkpoints.

- Introduce IIndexCheckpointProvider for tracking

IIndexCheckpointManager references.

Details:

- Unify LSM flush/merge operations completion order.

- Introduce index checkpoints, which contain:

- Index low watermark.

- Latest valid LSM component

- Mapping between master replica and local replica.

- Use index checkpoints instead of LSM component metadata

for identifying low watermark in recovery.

- Use index checkpoints in replication instead of overwriting

LSN byte offset in replica component metadata.

- Replace LSN_MAP used in replication by index checkpoints.

- Replace NIO Files.find by Commons FileUtils.listFiles to

avoid NoSuchFileException on any file deletion.

Change-Id: Ib22800002bf8ea3660242e599b3f5f20678301a8

Reviewed-on: https://asterix-gerrit.ics.uci.edu/2200

Sonar-Qube: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: abdullah alamoudi <bamousaa@gmail.com>

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Till Westmann <tillw@apache.org>

    • -25
    • +47
    ./AbstractLSMIOOperationCallbackTest.java
    • -2
    • +4
    ./LSMBTreeIOOperationCallbackTest.java
    • -2
    • +4
    ./LSMBTreeWithBuddyIOOperationCallbackTest.java
    • -2
    • +4
    ./LSMInvertedIndexIOOperationCallbackTest.java
    • -2
    • +4
    ./LSMRTreeIOOperationCallbackTest.java
  1. … 43 more files in changeset.
[ASTERIXDB-2161] Fix component id manage lifecycle

- user model changes: no

- storage format changes: no

- interface changes: yes. The interface of LSMIOOperationCallback

is changed

Details:

- The current way of managing component ids is not correct

when multiple partitions share the same primary op

tracker. It is possible that, while a partition is empty or being

flushed, the next flush is scheduled by another partition, which

will disturb the first partition. This patch fixes this by

using the same logic used for maintaining flushed LSNs to maintain

component ids.

- Extend recycle memory component interface to indicate whether it

switches the new component or not.

- Also fixes [ASTERIXDB-2168] to ensure we do not miss latest flushed

LSNs by advancing io callback before finishing flush

Change-Id: Ifc35184c4d431db9af71cab302439e165ee55f54

Reviewed-on: https://asterix-gerrit.ics.uci.edu/2153

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: abdullah alamoudi <bamousaa@gmail.com>

    • -0
    • +267
    ./AbstractLSMIOOperationCallbackTest.java
    • -60
    • +7
    ./LSMBTreeIOOperationCallbackTest.java
    • -60
    • +7
    ./LSMBTreeWithBuddyIOOperationCallbackTest.java
    • -60
    • +7
    ./LSMInvertedIndexIOOperationCallbackTest.java
    • -60
    • +7
    ./LSMRTreeIOOperationCallbackTest.java
  1. … 10 more files in changeset.
[ASTERIXDB-2115] Add Component Ids to LSM Indexes

- user model changes: no

- storage format changes: no

- interface changes: yes

Details:

- Add LSMComponentId to all LSM components. Component Ids are managed

through IO operation callbacks.

- For a memory component, its ID is reset every time it is recycled.

- For a disk component, its ID is copied from the source component(s)

during flush/merge

- For indexes of a dataset, we need to guarantee that all their memory

components receive the same ID. This is achieved using a shared

component Id generator.

- Fix memory component recycled callback, make sure it's called only

when we've indeed recycled the memory component

A design wiki for this patch: https://cwiki.apache.org/confluence/display/

ASTERIXDB/Component+Id-based+secondary-to-primary+index+acceleration

Change-Id: I8aec6261a84a0729ce35f4b1cb708be299ddb98d

Reviewed-on: https://asterix-gerrit.ics.uci.edu/2025

Sonar-Qube: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: abdullah alamoudi <bamousaa@gmail.com>

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

    • -2
    • +5
    ./LSMBTreeIOOperationCallbackTest.java
    • -2
    • +5
    ./LSMBTreeWithBuddyIOOperationCallbackTest.java
    • -2
    • +5
    ./LSMInvertedIndexIOOperationCallbackTest.java
    • -2
    • +5
    ./LSMRTreeIOOperationCallbackTest.java
  1. … 59 more files in changeset.
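
The component id rules listed in this commit can be sketched as follows. This is a minimal illustration, not the AsterixDB classes: `IdGenerator`, `Index`, and `merge` are hypothetical, and representing an id as an inclusive range (so that a merged component can cover all of its sources) is an assumption that the commit message does not state.

```java
// Illustrative sketch only; all names and the range representation are assumptions.
class ComponentIdSketch {
    static final class ComponentId {
        final long min;
        final long max;

        ComponentId(long min, long max) {
            this.min = min;
            this.max = max;
        }

        @Override
        public String toString() {
            return "[" + min + "," + max + "]";
        }
    }

    // Shared by all indexes of one dataset so their memory components agree.
    static final class IdGenerator {
        private long current = 0;

        void refresh() {
            current++;
        }

        ComponentId getId() {
            return new ComponentId(current, current);
        }
    }

    static final class Index {
        private final IdGenerator generator;
        ComponentId memoryComponentId;

        Index(IdGenerator generator) {
            this.generator = generator;
            this.memoryComponentId = generator.getId();
        }

        // The memory component id is reset every time the component is recycled.
        void recycle() {
            memoryComponentId = generator.getId();
        }

        // A flushed disk component copies the id of its source memory component.
        ComponentId flush() {
            return memoryComponentId;
        }
    }

    // A merged disk component covers the ids of its source components.
    static ComponentId merge(ComponentId older, ComponentId newer) {
        return new ComponentId(older.min, newer.max);
    }

    public static void main(String[] args) {
        IdGenerator generator = new IdGenerator();
        Index primary = new Index(generator);
        Index secondary = new Index(generator);
        ComponentId first = primary.flush();
        generator.refresh();
        primary.recycle();
        secondary.recycle();
        ComponentId second = primary.flush();
        System.out.println(secondary.memoryComponentId + " " + merge(first, second));
    }
}
```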
[ASTERIXDB-1764][STO] Ensure LOAD follows the same lifecycle as merge/flush

- user model changes: no

- storage format changes: no

- interface change: no

Details:

- Ensure ioOperationCallbacks are properly called for bulk loaded

component

- Add Load type to LSMIOOperationType to distinguish bulk loaded

component from flush component

- Change ILSMIOOperationCallback to use LSMIOOperationType instead of

LSMOperationType, because this callback only targets LSM IO

operations

Change-Id: Ib9ecf7292c5dbaf8638d159decc6e6faf79de58b

Reviewed-on: https://asterix-gerrit.ics.uci.edu/2131

Sonar-Qube: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: abdullah alamoudi <bamousaa@gmail.com>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

    • -9
    • +9
    ./LSMBTreeIOOperationCallbackTest.java
    • -9
    • +9
    ./LSMBTreeWithBuddyIOOperationCallbackTest.java
    • -9
    • +9
    ./LSMInvertedIndexIOOperationCallbackTest.java
    • -9
    • +9
    ./LSMRTreeIOOperationCallbackTest.java
  1. … 19 more files in changeset.
[NO ISSUE][STO] Add a callback on recycling of memory components

- user model changes: no

- storage format changes: no

- interface change: yes

- ILSMIOOperationCallbackFactory.createIoOpCallback now takes

the ILSMIndex as a parameter.

- Remove ILSMIOOperationCallback.setNumOfMutableComponents

The callback can find out the number of mutable components

on instantiation since the lsm index is now passed.

- ILSMIOOperationCallback.allocated was added.

It gets called whenever a memory component is allocated.

- ILSMIOOperationCallback.recycled was added.

It gets called whenever a memory component is recycled.

- ILSMIndex.hasMemoryComponent is replaced with

ILSMIndex.getNumberOfMemoryComponents

Change-Id: I578ffd7ef17784034c94f3c0d23cd5094e39f6e0

Reviewed-on: https://asterix-gerrit.ics.uci.edu/2126

Sonar-Qube: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Murtadha Hubail <mhubail@apache.org>

    • -4
    • +7
    ./LSMBTreeIOOperationCallbackTest.java
    • -4
    • +7
    ./LSMBTreeWithBuddyIOOperationCallbackTest.java
    • -4
    • +7
    ./LSMInvertedIndexIOOperationCallbackTest.java
    • -4
    • +7
    ./LSMRTreeIOOperationCallbackTest.java
  1. … 102 more files in changeset.
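
The interface changes in this commit can be sketched with simplified stand-ins. The method names `createIoOpCallback`, `getNumberOfMemoryComponents`, `allocated`, and `recycled` come from the commit message; their parameter lists here (an `int` component slot instead of a component object), the `CountingCallback` class, and everything else are assumptions made to keep the example self-contained.

```java
// Illustrative sketch only; simplified stand-ins, not the real Hyracks interfaces.
class RecycleCallbackSketch {
    interface Index {
        int getNumberOfMemoryComponents();
    }

    interface IoCallback {
        void allocated(int component);

        void recycled(int component);
    }

    // Sizes its state from the index, so no setNumOfMutableComponents call is needed.
    static final class CountingCallback implements IoCallback {
        final boolean[] isAllocated;
        final int[] recycleCount;

        CountingCallback(Index index) {
            int n = index.getNumberOfMemoryComponents();
            isAllocated = new boolean[n];
            recycleCount = new int[n];
        }

        @Override
        public void allocated(int component) {
            isAllocated[component] = true;
        }

        @Override
        public void recycled(int component) {
            recycleCount[component]++;
        }
    }

    // Stand-in for the factory method that now receives the index.
    static CountingCallback createIoOpCallback(Index index) {
        return new CountingCallback(index);
    }

    public static void main(String[] args) {
        CountingCallback callback = createIoOpCallback(() -> 2);
        callback.allocated(0);
        callback.recycled(0);
        callback.recycled(0);
        System.out.println(callback.recycleCount[0] + " recycles of component 0");
    }
}
```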
ASTERIXDB-1917: FLUSH_LSN for disk components is not correctly set

- Fixed a bug where FLUSH_LSN for flushed disk components is not

correctly set (not increasing) when an NC has multiple partitions.

- Added LSMIOOperationCallback unit tests to cover this bug

Change-Id: If438e34f8f612458d81f618eea04c0c72c49a9fe

Reviewed-on: https://asterix-gerrit.ics.uci.edu/1771

Reviewed-by: abdullah alamoudi <bamousaa@gmail.com>

Sonar-Qube: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

BAD: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

    • -0
    • +84
    ./LSMBTreeIOOperationCallbackTest.java
    • -0
    • +84
    ./LSMBTreeWithBuddyIOOperationCallbackTest.java
    • -0
    • +84
    ./LSMInvertedIndexIOOperationCallbackTest.java
    • -0
    • +84
    ./LSMRTreeIOOperationCallbackTest.java
  1. … 1 more file in changeset.