Clone Tools
  • last updated a few minutes ago
Constraints
Constraints: committers
 
Constraints: files
Constraints: dates
[ASTERIXDB-2541][STO] Introduce GreedyScheduler

- user model changes: yes.

Add new option: storage.io.scheduler (async/greedy)

- storage format changes: no.

- interface changes: yes.

Introduce IIndexCursorStats

Details:

- Introduce GreedyScheduler that always executes the merge

operation with the smallest number of remaining pages to minimize

the number of disk components

- Introduce IIndexCursorStats to collect the statistics of index scans.

This allows GreedyScheduler to know the remaning pages of merge

operations.

- Extend AbstractIoOperation so that GreedyScheduler can pause/resume

merge operations if needed.

Change-Id: I38fe394d1180d4e3f6796064c0e6c6630b6ad303

Reviewed-on: https://asterix-gerrit.ics.uci.edu/3284

Reviewed-by: Murtadha Hubail <mhubail@apache.org>

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Tested-by: Michael Blow <mblow@apache.org>

    • -0
    • +133
    ./GreedySchedulerTest.java
  1. … 73 more files in changeset.
[ASTERIXDB-2600][STO] Introduce ConcurrentMergePolicy

- user model changes: yes. Add a new merge policy and make it as default

- storage format changes: no.

- interface changes: no.

Details:

- Introduce ConcurrentMergePolicy that performs concurrent merges

without the maximum component size.

- Make this merge policy as the default merge policy in AsterixDB since

the PrefixMergePolicy has made some wrong design decisions.

Change-Id: I2ed79847584b9fe846d62ad56ee094863538a2a2

Reviewed-on: https://asterix-gerrit.ics.uci.edu/3463

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Luo Chen <cluo8@uci.edu>

Reviewed-by: Till Westmann <tillw@apache.org>

Reviewed-by: Murtadha Hubail <mhubail@apache.org>

    • -0
    • +190
    ./ConcurrentMergePolicyTest.java
  1. … 32 more files in changeset.
[NO ISSUE] Make IOManager more configurable

Change-Id: I1c8ad11c2b8b983ef4bf7cf78c2f068accddfff4

Reviewed-on: https://asterix-gerrit.ics.uci.edu/3133

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Michael Blow <mblow@apache.org>

Contrib: Michael Blow <mblow@apache.org>

  1. … 10 more files in changeset.
[NO ISSUE] Apply / enforce java import order

The process-sources target will now sort imports as well as

format source code; the source-format job will likewise verify

import order in addition to source code format

Change-Id: I55d976c4df10d9919c6a25683be2a3e3304e65d9

Reviewed-on: https://asterix-gerrit.ics.uci.edu/3288

Integration-Tests: Michael Blow <mblow@apache.org>

Tested-by: Michael Blow <mblow@apache.org>

Reviewed-by: Till Westmann <tillw@apache.org>

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

    • -2
    • +0
    ./LSMComponentFilterReferenceTest.java
  1. … 625 more files in changeset.
[ASTERIXDB-2414][STO] Remove deleted component files from buffer cache

- user model changes: no

- storage format changes: no

- interface changes: no

Details:

- When activating an LSM index, we remove files of components

that were merged into a bigger component but not cleaned up yet.

- However, we sometimes leave a file reference mapped in the buffer

cache even when the file is removed from disk.

- This change ensures that all files are removed from the buffer

cache as well.

Change-Id: If0f11bc222662e4b50c1b47b1dfa6b30d1463b2e

Reviewed-on: https://asterix-gerrit.ics.uci.edu/2822

Sonar-Qube: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: abdullah alamoudi <bamousaa@gmail.com>

  1. … 7 more files in changeset.
[NO ISSUE][STO] Add consistency to flush lifecycle

- user model changes: no

- storage format changes: yes

- renamed AbstractLSMIOOperationCallbackFactory

to LSMIOOperationCallbackFactory

- useless classes have been removed.

- LSMBTreeIOOperationCallbackFactory

- LSMBTreeWithBuddyIOOperationCallbackFactory

- LSMInvertedIndexIOOperationCallbackFactory

- LSMRTreeIOOperationCallbackFactory

- interface changes: yes

Details:

- Previously, flushes have different lifecycle depending

on the memory component state

- not allocated

- allocated

- modified

- In certain cases, flush operations are skipped alltogether

- IO Operation callbacks became complicated and difficult

to maintain since calls are done differently in different

cases.

- In certain cases, afterFinalize is called on the IO

Operation callbacks even if beforeOperation was never

called.

- In this change, flushes go through the same lifecycle

events regardless of the state of the memory component.

- In addition, primary and secondary memory components

would reside in different virtual buffer caches due

to skipped flushes, or due to having the secondary

index created when the primary index's memory component

is residing on the virtual buffer cache with index !=0.

- Moreover, when flushes are lagging and all memory

components are being flushed, search operations assumes

the oldest of the memory component is the newest and

produces incorrect results.

- In addition, in case of a failed flush of a component,

the IO scheduler would skip it and flush the next

component. This would produce a bad state on disk.

- In this change, a failed flush can be retried. otherwise,

all future flushes of the component fail due to the failure

of the previously failed flush.

- Previously, when a component fails to modify an index due

to flush failures, it assumes disk is full.

- With this change, the modification failure reports the

original cause of the failed flush.

Change-Id: I29f7992ec6c0f71c5b63d45800b2fb590d651e4b

Reviewed-on: https://asterix-gerrit.ics.uci.edu/2584

Reviewed-by: Murtadha Hubail <mhubail@apache.org>

Tested-by: Murtadha Hubail <mhubail@apache.org>

  1. … 161 more files in changeset.
[NO ISSUE][STO] Move the IO threads from BufferCache to IOManager

- user model changes: no

- storage format changes: no

- interface changes: no

Details:

Move the IO threads from BufferCache to IOManager to cover all

IO uses that go through the IOManager.

Change-Id: Ic02b456826ae7abc2619a7eec3f90b48717b0adb

Reviewed-on: https://asterix-gerrit.ics.uci.edu/2417

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Michael Blow <mblow@apache.org>

  1. … 17 more files in changeset.
[ASTERIXDB-2256] Reformat sources using code format template

Change-Id: I4faa141c1a8c9700d5e9ac50b839acc9d1eede73

Reviewed-on: https://asterix-gerrit.ics.uci.edu/2310

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Murtadha Hubail <mhubail@apache.org>

    • -1
    • +1
    ./LSMComponentFilterReferenceTest.java
  1. … 983 more files in changeset.
[NO ISSUE][RT][IDX] Simplify index.createAccessor()

- user model changes: no

- storage format changes: no

- interface change: yes

(changed) IIndex, ILSMIndex

(new) IIndexAccessParameters

details:

- Refactor index.createAccessor() method to accept

an instance of IIndexAccessParameters as its parameter

since currently only ModificationCallBack and

SearchOperationCallback can be passed. If an accessor

needs to have additional parameters, there was no way

to pass them.

Change-Id: Iae015c342e830c81d666428447b595280139740e

Reviewed-on: https://asterix-gerrit.ics.uci.edu/2120

Sonar-Qube: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: abdullah alamoudi <bamousaa@gmail.com>

  1. … 74 more files in changeset.
[NO ISSUE][STO] Fix memory leaks in storage

- user model changes: no

- storage format changes: no

- interface changes: yes

- Added javadocs to:

-- IBufferCache

-- IExtraPageBlockHelper

- Moved IBufferCache.setPageDiskId -> ICachedPage.setDiskPageId

- Renamed:

-- IBufferCache.flushDirtyPage -> IBufferCache.flush

-- IBufferCache.getNumPages -> IBufferCache.getPageBudget

- Removed:

-- IBufferCache.adviseWontNeed [not used]

-- IBufferCache.tryPin [not used]

details:

- Previously, when adding a kv pair to the metadata of a memory

component, we add a new Pair item to the ArrayList. After

this change, we only update it if it exists.

- VirtualBufferCache used to leak pages when reclaiming pages

of a file after deletion. This has also been fixed.

- New tests for VirtualBufferCache added:

- Checks for memory budget after end of testDisjointPins

- Concurrent Users pinning pages concurrently

- Test for large pages and ensuring allocated large

pages are accounted for through removal of cached

free pages.

Change-Id: I4ae9736c9b5fdba5795245bdf835c023e3f73b15

Reviewed-on: https://asterix-gerrit.ics.uci.edu/2115

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Murtadha Hubail <mhubail@apache.org>

    • -53
    • +267
    ./VirtualBufferCacheTest.java
  1. … 39 more files in changeset.
Revert "[ASTERIXDB-2103][STO] Too many disk components for CorrelatedPolicy"

This reverts commit 21ed0f72681a20ccb6a654f9aa4d54b8d0ea9c5c.

Change-Id: I670545acd09c678f21be25313353ab306be86202

Reviewed-on: https://asterix-gerrit.ics.uci.edu/2063

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Ian Maxon <imaxon@apache.org>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

  1. … 54 more files in changeset.
[ASTERIXDB-2103][STO] Too many disk components for CorrelatedPolicy

- user model changes: no

- storage format changes: no

- interface changes: yes

Details:

Currently CorrelatedMergePolicy uses component Ids to ensure disk

components of primary and secondary indexes are merged together,

but without synchronization. However, this results in too many disk

components for secondary InvertedIndex. The reason is that secondary

index could miss some round of merges, if the merge policy finds out

the corresponding secondary components are not available (either being

merged or being flushed). Even though flow-control on secondary indexes

can guarantee the secondary index would catch up the next time, it is

still possible that the primary component is finialized, which leaves

the secondary components which miss this round of merge are never merged

again.

This patch fixes this bug by:

- Add the mechanism of depending operations to LSM IO operation. An

operation finishes only after all depending operations have finished.

- For correlated merge policy, the flush/merge of the primary index depends

on all flushes/merges of secondary indexes. This ensures when the

correlated policy schedules merge, all related components of all indexes

are available to merge.

Change-Id: Ib6c06ee23f3bfd16b758802388389c00e29780b1

Reviewed-on: https://asterix-gerrit.ics.uci.edu/2018

Sonar-Qube: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Jianfeng Jia <jianfeng.jia@gmail.com>

  1. … 54 more files in changeset.
[NO ISSUE][STO] Component Deletes Through flushes and merges

- user model changes: no

- storage format changes: no

- interface changes: yes

- moved validation of component from the index:

- ILSMIndex and all of its implementations

to the component:

- ILSMDiskComponent and all of its implementations

details:

- This change enables component level deletes.

Change-Id: I178656207bfa1d15e6ae5ff2403a16df33940773

Reviewed-on: https://asterix-gerrit.ics.uci.edu/2017

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Murtadha Hubail <mhubail@apache.org>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

  1. … 78 more files in changeset.
[ASTERIXDB-1945][STO] Cleanup Buffer Cache API

- user model changes: no

- storage format changes: no

- interface changes: yes

INcApplicationContext

- removed IFileMapProvider getFileMapManager();

to hide FileMapManager from other components;

IStorageManager

- IFileMapProvider getFileMapProvider(INCServiceContext ctx);

to hide FileMapManager from other components;

IFileHandle

- added FileReference getFileReference();

to avoid unnecessary casts;

IIOManager

- public void deleteWorkspaceFiles() throws

HyracksDataException;

added throws;

ILSMIndexFileManager

- void createDirs() throws HyracksDataException;

added throws;

IInvertedIndex

- added void purge() throws HyracksDataException;

a. InvertedIndexes don't implement the ITreeIndex interface.

b. when we deactivate a disk component, we need to purge it so

the buffer cache doesn't go through each page.

c. this need to be revisited, ASTERIXDB-1944

IFileMapManager

- int registerFile(FileReference fileRef) throws

HyracksDataException;

return value added for future reference of the index file

inside BufferCache or VirtualBufferCache;

- FileReference unregisterFile(int fileId) throws

HyracksDataException;

return value added for future refernece of the file;

IBufferCache

- int createFile(FileReference fileRef) throws

HyracksDataException;

return value added for future reference of the index file

inside BufferCache or VirtualBufferCache;

- void deleteFile(int fileId) throws HyracksDataException;

remove the dirty page flag since there's no dirty page;

- int openFile(FileReference fileRef) throws

HyracksDataException;

return value added for future reference of the index file

inside BufferCache or VirtualBufferCache;

- added void deleteFile(FileReference file) throws

HyracksDataException;

we used to have this public methods in both BufferCache

and VirtualBufferCache. Now we lifted it into the interface.

AbstractLSMIndex

- removed protected abstract void

destroyMemoryComponent(ILSMMemoryComponent c)

throws HyracksDataException;

It is because turned out when we deactivate, we actually

destroy them. However, because of the not well defined API,

double destroy was okay and so we used to do double destroy.

Details:

This change fixes the buffer cache to follow the API such that:

1. createFile creates the file.

2. deleteFile deletes the file.

3. openFile opens the file.

4. closeFile closes the file.

5. creates existing file is not allowed.

6. deletes deleted file is not allowed.

7. open non existing file is not allowed.

In addition, we hide the file map from all other components.

Change-Id: I0a973c2adb2e7fdcbbf18c7b888af3de5f0acc74

Reviewed-on: https://asterix-gerrit.ics.uci.edu/1843

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

BAD: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Till Westmann <tillw@apache.org>

  1. … 160 more files in changeset.
Revert "ASTERIXDB-1945 [STO] Cleanup Buffer Cache API"

This reverts commit ae3daf6ef3397e583637360dc460c6391e03dc29.

Change-Id: I5e4e23f43a68e82c38fb8d1d7f4c0d01985c3a10

Reviewed-on: https://asterix-gerrit.ics.uci.edu/1842

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

BAD: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Ian Maxon <imaxon@apache.org>

  1. … 160 more files in changeset.
ASTERIXDB-1945 [STO] Cleanup Buffer Cache API

Fix the buffer cache to follow the API such that:

1. createFile creates the file.

2. deleteFile deletes the file.

3. openFile opens the file.

4. closeFile closes the file.

5. creates existing file is not allowed.

6. deletes deleted file is not allowed.

7. open non existing file is not allowed.

In addition, we hide the file map from all other components.

Change-Id: I15565b07afdc94ac74c608bfe4480fa09dcf8f1c

Reviewed-on: https://asterix-gerrit.ics.uci.edu/1840

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

BAD: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Murtadha Hubail <hubailmor@gmail.com>

  1. … 160 more files in changeset.
Avoid always merging old components in prefix policy

Current, the prefix policy always looks at the components from

oldest to newest to schedule merge. One negative consequence is that

the oldest (largest) component gets merged over and over again

until it reaches the size limit. This is undesirable since it takes

O(n^2) disk IOs (n is the number of flushed components) to produce a

final component.

This patch is a temporary fix of this behavior, taken from the idea of

HBase compaction policy (https://www.ngdata.com/visualizing-hbase

-flushes-and-compactions/). The basic idea is that it introduces

some size factor (for now it's 1.2) to control the merge behavior.

When the prefix policy finds a sequence of components to merge,

we also check the oldest (largest) component in the sequence should

be smaller than 1.2*the total size of all younger components.

By doing so, we can avoid merging oldest components over and over again,

making the disk IOs O(nlog n).

Change-Id: I464da3fed38cded0aee7b319a35664eae069a2ba

Reviewed-on: https://asterix-gerrit.ics.uci.edu/1818

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

BAD: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Yingyi Bu <buyingyi@gmail.com>

    • -0
    • +232
    ./PrefixMergePolicyTest.java
  1. … 2 more files in changeset.
Fix upsert deadlock and upsert with filtered primary only

This change fixes a deadlock that happens when 3 operations

an upsert, a search and a flush happen simulteniously.

If all the memory components are full, the upsert

gets blocked, the upsert could've obtained a lock on the

search key which would block the search not allowing it

to exit the components and not allowing the components

to be cleared and reused.

In addition, the change refactors common LSM index code.

Change-Id: I93fac0f27ab0b3cc071ff38aef90d850cbbce488

Reviewed-on: https://asterix-gerrit.ics.uci.edu/1762

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

BAD: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Murtadha Hubail <hubailmor@gmail.com>

    • -0
    • +63
    ./LSMComponentFilterReferenceTest.java
    • -0
    • +267
    ./LSMIndexFileManagerTest.java
    • -0
    • +139
    ./VirtualBufferCacheTest.java
    • -0
    • +73
    ./VirtualFreePageManagerTest.java
  1. … 83 more files in changeset.